UBot Underground

Newbie Question regarding Google Search Result scraping



Hi, I'm new to UBot and trying to follow the tutorial videos. I'm now at video 5 (my problem is at 14:30).

 

It's about scraping Google search results. My problem is that Google changed how the search results are delivered, and the method described in the video does not work for me.

 

Here is a screenshot of the advanced element editor. As you can see there is no class attribute: http://screencast.com/t/q0HnY6Ww

 

Any idea how I can get all search results into my list?

 

Thanks for any tips ;)

  • 2 weeks later...

try this

 

clear list(%scrape url)
add list to list(%scrape url, $scrape attribute(<class=w"_zd*">, "innertext"), "Delete", "Global")

 

works here

 

Notice that the <cite> class starts with _zd. Make the rest a wildcard, because the class values differ after _zd.

 

Use a web inspector, either UBot's own or Firebug for Firefox, and look at what is the same for each element you wish to scrape, then wildcard the rest. Look at a few elements to see where they are the same and where they differ.
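
To make the "keep the common part, wildcard the rest" idea concrete, here is a minimal sketch in Python rather than uBot script. It assumes the w"..." selector behaves roughly like a shell-style wildcard, which is an assumption and not something confirmed in this thread. The class values below are made up for illustration; only the _zd prefix comes from the post above.

```python
from fnmatch import fnmatchcase

# Hypothetical class values, as a web inspector might show them for
# several elements on a results page (only the "_zd" prefix is from the post).
class_values = ["_zdA", "_zd9 _x1", "r", "_zdQ", "st"]


def matching_classes(values, pattern):
    """Return the values that match a shell-style wildcard pattern."""
    return [value for value in values if fnmatchcase(value, pattern)]


# "_zd*" keeps the part every target element shares and wildcards the rest.
matches = matching_classes(class_values, "_zd*")
print(matches)
```

If a pattern matches too much or too little, compare a few more elements in the inspector and adjust the fixed part.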

 

TC


This code didn't work. Can you please help with how to scrape URLs from Google?


Here you go. I looked at my HTTP code and remembered how I did it. That was months ago, but it is still good stuff and should work for you guys. Remember, this is just a quick example to show the page scrape, not the navigation and all the other stuff.

clear list(%keywords)
add list to list(%keywords, $list from text("purple
green
yellow
blue
red", $new line), "Delete", "Global")
clear list(%scrape url)
set user agent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0")
loop($list total(%keywords)) {
    set(#KW_next_item, $next list item(%keywords), "Global")
    clear cookies
    navigate("https://www.google.com/", "Wait")
    wait for element(<innertext="Google Search">, 10, "Appear")
    type text(<name="q">, "{#KW_next_item} balloons", "Standard")
    click(<name="btnK">, "Left Click", "No")
    wait($rand(3, 10))
    wait for element(<innertext="Help">, 10, "Appear")
    add list to list(%scrape url, $list from text($find regular expression($scrape attribute(<class="r">, "outerhtml"), "(?<=<h3 class=\"r\"><a href=\").*?(?=\" onm)"), $new line), "Delete", "Global")
}
ui stat monitor("urls: {$list total(%scrape url)}", "")
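
The regular expression in the scrape line uses a lookbehind and a lookahead, so only the URL between the two markers is returned. Below is a minimal sketch of the same pattern in Python, run against made-up HTML. The sample markup and the extract_urls helper are assumptions for illustration; Google's real markup changes often, so the pattern may need adjusting.

```python
import re

# Hypothetical result markup; real Google HTML differs and changes over time.
sample_html = (
    '<h3 class="r"><a href="https://example.com/red" onmousedown="x()">Red</a></h3>\n'
    '<h3 class="r"><a href="https://example.org/blue" onmousedown="y()">Blue</a></h3>'
)

# Same idea as the pattern in the script above:
#   (?<=...) must come right before the match but is not part of it
#   .*?      is the shortest possible run of characters (the URL)
#   (?=...)  must come right after the match but is not part of it
URL_PATTERN = re.compile(r'(?<=<h3 class="r"><a href=").*?(?=" onm)')


def extract_urls(html):
    """Return every href value that sits between the two markers."""
    return URL_PATTERN.findall(html)


urls = extract_urls(sample_html)
print(urls)
```

If the scrape comes back empty, inspect the outer HTML of one result and check that both markers still appear exactly as written in the pattern.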

hope that helps,

 

TC


A bit complicated for a newbie, but it works fine. I need your suggestion: can you tell me where I can learn more about UBot?


http://ubotstudio.com/tutorials

 

http://wiki.ubotstudio.com/wiki/Main_Page

 

 

Put this in your favorite search engine:

 

site:ubotstudio.com

 

Then add whatever you are looking to learn, like this:

 

site:ubotstudio.com how to scrape

 

 

TC


Thanks, Traffik Cop. I would like to add you as a friend; please accept me.

