cd1168, posted September 19, 2014 (edited September 22, 2014):

Hi, I am having trouble with a scraping script. I am trying to scrape a page for information and add the scraped information to a list. The input of what to scrape comes from a UBot list, so to loop I create a variable that holds #totalrows and a #row variable that tells UBot which row to insert and then scrape. Individually this works fine, but when I try to implement it using thread and new browser, the new thread's process does not increment #row, so it scrapes the first #row #totalrows times. I have read this other post: http://www.ubotstudio.com/forum/index.php?/topic/15985-insert-into-mysql-table/ and tried to follow it, but I am still having problems. Here is my code. Can anyone help me find my bug?
UBotDev, posted September 19, 2014:

From a quick look at the code I see 2 problems:

First, you need to move "increment(#row)" out of "$scrapeFunction()", because you want to start processing the 2nd row as soon as the 1st thread is started. Therefore you need to execute the increment right after the 1st thread is started, not after a few other commands have executed.

Second, you need to pass #row into "$scrapeFunction()" as a parameter, to ensure that it doesn't change in that scope (inside that function/thread).

PS: High-speed threading is also broken, as shown here: http://www.ubotstudio.com/forum/index.php?/topic/15122-must-read-threading-doesnt-work-as-expected-tested-in-v4/
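The two points above are not specific to UBot, so here is a language-neutral sketch in Python rather than UBot script. It is only an illustration under assumptions: `scrape_row`, `start_thread`, and `run` are hypothetical names, and the "scraping" is a stand-in that just echoes the row it was given. What it shows is that the row index is handed to each thread as an argument at the moment the thread is started, and the counter is incremented in the main loop, never inside the thread.

```python
import threading


def scrape_row(row):
    # Stand-in for the real scraping work (hypothetical); it only
    # reports which row it was asked to process.
    return f"scraped row {row}"


def start_thread(row, results, lock):
    # 'row' is bound here as a function argument, so later changes to
    # the caller's counter cannot affect this thread.
    def worker():
        value = scrape_row(row)
        with lock:
            results.append((row, value))

    thread = threading.Thread(target=worker)
    thread.start()
    return thread


def run(total_rows):
    results = []
    lock = threading.Lock()
    threads = []
    row = 0
    for _ in range(total_rows):
        threads.append(start_thread(row, results, lock))
        row += 1  # incremented in the main loop, outside the thread
    for thread in threads:
        thread.join()
    # Threads may finish in any order, so sort by row before returning.
    return sorted(results)
```

Calling `run(5)` processes each of the rows 0 through 4 exactly once. If the threads instead read a shared counter and incremented it themselves, several of them could see the same value, which matches the symptom described in the first post.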
cd1168, posted September 19, 2014:

Yes, but how do I accomplish passing in the variable?
UBotDev, posted September 19, 2014:

Just add it to "Parameters" when defining your function, so it looks like this:

    define $scrapeFunction(#row) {
    }

Also make sure to check the wiki: http://wiki.ubotstudio.com/wiki/Define
cd1168, posted September 19, 2014:

OK, thank you, let me try.
cd1168, posted September 19, 2014 (edited September 22, 2014):

Still a problem. Here is what I have. It only reads the 1st member of the list. Does it matter whether the define function is above or below in the script? Should it be below where I call it? Again, thanks for your patience. I am quite new to UBot (2-3 days) and a little overwhelmed at the moment.

    define $scrapeFunction(#row) {
        allow flash("No")
        allow css("No")
        allow images("No")
        set visibility("invisible")
        navigate("https://xxx.com/vin/xx", "Wait")
        wait for element(<name="id">, "", "Appear")
        wait for element(<id="recaptcha_challenge_image">, "", "Appear")
        if($both($exists(<id="recaptcha_challenge_image">), $exists(<name="id">))) {
            then {
                set(#id, $table cell(&idsToProcess, #row, 1), "Global")
                add item to list(%ids, #id, "Delete", "Global")
                type text(<name="id">, #id, "Standard")
                type text(<name="recaptcha_response_field">, $solve captcha(<id="recaptcha_challenge_image">), "Standard")
                click(<value="Submit">, "Left Click", "No")
                wait(1)
                set(#vinScraped, $scrape attribute(<tagname="section">, "innertext"), "Global")
                add item to list(%idScraped, #idScraped, "Delete", "Global")
            }
        }
        return(%idScraped)
    }
    set(#VintoProcessCnt, $table total rows(&idsToProcess), "Global")
    set(#row, 0, "Global")
    loop($table total rows(&idsToProcess)) {
        thread {
            in new browser {
                type text(<about me textarea>, $scrapeFunction(#row), "Standard")
            }
            increment(#row)
        }
    }
UBotDev, posted September 19, 2014:

As I said, you need to increment #row outside the thread, not inside. Besides that, you need another define command around the part where you spawn a new thread in order for this to work. Here is an example, with your code stripped down and some code added:

    clear list(%rows)
    define $scrapeFunction(#row) {
        return(#row)
    }
    set(#row, 0, "Global")
    loop(10) {
        THREAD START(#row)
        increment(#row)
        wait(0.5)
    }
    define THREAD START(#row) {
        thread {
            add item to list(%rows, $scrapeFunction(#row), "Delete", "Global")
            wait(1)
        }
    }

Here is a slightly more advanced example that also uses a THREAD START command: http://www.ubotstudio.com/forum/index.php?/topic/15441-free-plugin-threads-counter-ubot-v4-threading-fixed/
cd1168, posted September 22, 2014:

Hi, and thank you for the guidance, but I still have a bug somewhere. It is not reading the correct row from the list that I need. I would appreciate any leads on where it is.

    define $scrapeFunction(#row) {
        allow flash("No")
        allow css("No")
        allow images("No")
        set visibility("invisible")
        navigate("https://vccp.com", "Wait")
        wait for element(<name="id">, "", "Appear")
        wait for element(<id="recaptcha_challenge_image">, "", "Appear")
        if($both($exists(<id="recaptcha_challenge_image">), $exists(<name="id">))) {
            then {
                set(#VIN, $table cell(&idsToProcess, #row, 1), "Global")
                add item to list(%ids, #id, "Delete", "Global")
                type text(<name="id">, #id, "Standard")
                type text(<name="recaptcha_response_field">, $solve captcha(<id="recaptcha_challenge_image">), "Standard")
                click(<value="Submit">, "Left Click", "No")
                wait(1)
                set(#idScraped, $scrape attribute(<tagname="section">, "innertext"), "Global")
                add item to list(%idScraped, #idScraped, "Don\'t Delete", "Global")
            }
        }
        return(#row)
    }
    define THREAD START(#row) {
        thread {
            add item to list(%rows, $scrapeFunction(%idScraped), "Delete", "Global")
            wait(1)
        }
    }
    set(#idtoProcessCnt, $table total rows(&idsToProcess), "Global")
    set(#row, 0, "Global")
    loop(#idtoProcessCnt) {
        in new browser {
            THREAD START(#row)
            increment(#row)
            wait(0.5)
        }
    }
    }
UBotDev, posted September 22, 2014:

You should pass #row to your $scrapeFunction, not the list %idScraped, right?
cd1168, posted September 22, 2014:

Yes... but it doesn't scrape anything. There are no errors; it just sits there and doesn't finish.

    comment("function to scrape")
    define $scrapeFunction(#row) {
        allow flash("No")
        allow css("No")
        allow images("No")
        set visibility("invisible")
        navigate("www.com", "Wait")
        wait for element(<name="id">, "", "Appear")
        wait for element(<id="recaptcha_challenge_image">, "", "Appear")
        if($both($exists(<id="recaptcha_challenge_image">), $exists(<name="id">))) {
            then {
                set(#id, $table cell(&idsToProcess, #row, 1), "Global")
                add item to list(%ids, #id, "Delete", "Global")
                type text(<name="id">, #id, "Standard")
                type text(<name="recaptcha_response_field">, $solve captcha(<id="recaptcha_challenge_image">), "Standard")
                click(<value="Submit">, "Left Click", "No")
                wait(1)
                set(#idScraped, $scrape attribute(<tagname="section">, "innertext"), "Global")
                add item to list(%idScraped, #idScraped, "Don\'t Delete", "Global")
            }
        }
        return(#row)
    }
    define THREAD START(#row) {
        thread {
            add item to list(%rows, $scrapeFunction(%idScraped), "Delete", "Global")
            wait(1)
        }
    }
    set(#idtoProcessCnt, $table total rows(&idsToProcess), "Global")
    set(#row, 0, "Global")
    loop(#idtoProcessCnt) {
        in new browser {
            THREAD START(#row)
            increment(#row)
            wait(0.5)
        }
    }