UBot Underground

Scraping results from Facebook



I haven't touched UBot in a few years, so I'm really, REALLY rusty. I'm sure I'm forgetting some scraping basics, but I've reviewed the videos and am still having a hard time isolating the data I want.

 

There is a Facebook page for a business that I work with where we're running a contest, and we need to pull all the "likers" from the recent history. I have the URL for this data, but there's no export, hence the need to build my own scraping solution (as opposed to copying and pasting). I'll be pulling this type of data several times per year, so it's worth creating a bot.

 

Sample data I'm trying to target:

<div class="fsl fwb fcb"><a href="https://www.facebook.com/some.user?fref=pb&hc_location=profile_browser" data-gt="{"engagement":{"eng_type":"1","eng_src":"2","eng_tid":"100001243555706","eng_data":[]}}">Some User</a></div>

There are a number of these on the page, and I need to scrape each and every one of them. The goal is to get it to output like this in the debugger for the %people list...

Some User
Some Other User
Another User
A fourth User
...
...
...

The bot doesn't have any special features or data saving yet, but here's roughly what it looks like:

bot source command(25, "me@mydomain.com", "xxxxxxxxx")
navigate("https://www.facebook.com/browse/page_fans/?page_id=16553426645562", "Wait")
add list to list(%people, $scrape attribute(_______, "innertext"), "Delete", "Global")

 

 

It's the underscored area that I can't quite seem to figure out. The following regex does seem to work in another regex application I use:

<div class="fsl fwb fcb"><a.*?>(.*)?</a></div>

Where am I going wrong? Is there something wrong with how I'm choosing to go about it? I don't even know where to start with this.
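
One thing worth checking in the pattern above: `(.*)?` is a greedy group made optional, so when several entries sit on the same line it can run all the way to the last `</a></div>`. A lazy group, `(.*?)`, stops at the first one. Here is a minimal Python sketch of that difference, using made-up sample markup modeled on the snippet above. It is not UBot code, and it says nothing about whether the selector field in `$scrape attribute` accepts a regex at all.

```python
import re

# Hypothetical sample: two entries on one line, modeled on the snippet above.
html = (
    '<div class="fsl fwb fcb"><a href="https://www.facebook.com/some.user">Some User</a></div>'
    '<div class="fsl fwb fcb"><a href="https://www.facebook.com/other.user">Some Other User</a></div>'
)

# Original pattern: greedy capture, so it swallows everything up to the LAST closing tags.
greedy = re.compile(r'<div class="fsl fwb fcb"><a.*?>(.*)?</a></div>')

# Lazy capture: stops at the FIRST closing tags, giving one match per entry.
lazy = re.compile(r'<div class="fsl fwb fcb"><a.*?>(.*?)</a></div>')

greedy_names = greedy.findall(html)
lazy_names = lazy.findall(html)

print(len(greedy_names))  # a single over-long match
print(lazy_names)
```

If the other regex tool was tested against a single entry, both patterns would look identical there, which may be why the original seemed to work.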

 

Ultimately, I'll need the bot to click "show more" at the bottom many times, since the page shows 14 results and each click loads another 14 via AJAX (no page refresh). Wouldn't it be nice if Facebook just gave us an Excel export? WTF.
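
The "click, re-scrape, repeat" loop needs a stopping rule, and one simple rule is to stop as soon as a pass adds nothing new. Below is a rough Python sketch of that logic only. `load_batch` is a hypothetical stand-in for "click show more, then scrape the page again", and `make_fake_loader` just simulates a page that reveals 14 more entries per click. Neither is a real UBot or Facebook call.

```python
def collect_all(load_batch, max_clicks=500):
    """Call load_batch() until a pass adds nothing new or max_clicks is reached."""
    people = []
    seen = set()
    for _ in range(max_clicks):
        added = 0
        for item in load_batch():
            # Re-scraping returns entries already collected, so skip those.
            if item not in seen:
                seen.add(item)
                people.append(item)
                added += 1
        if added == 0:
            break  # nothing new appeared: assume the list is exhausted
    return people


def make_fake_loader(all_items, page_size=14):
    """Simulated page: each call 'clicks' once and returns everything visible so far."""
    shown = 0

    def load_batch():
        nonlocal shown
        shown = min(shown + page_size, len(all_items))
        return all_items[:shown]

    return load_batch


fans = [f"user{i}" for i in range(30)]
result = collect_all(make_fake_loader(fans))
print(len(result))
```

Two different people can share a display name, so the profile URL is probably a safer thing to de-duplicate on than the name. The `max_clicks` cap is there so the loop can't spin forever if the page misbehaves.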

 

Anyway, thanks to anyone who can reply and help.  :mellow:


Actually, I think the system was auto-suggesting a less-than-ideal selector. I modified it a little and came up with this:

 

add list to list(%people, $scrape attribute(<class="fsl fwb fcb">, "innertext"), "Delete", "Global")

 

...which produced a nice, clean list of 14 names. The next step is to grab the profile URL for each person so everything can be saved in a spreadsheet. Baby steps.
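
For the name-plus-URL step, here is a minimal sketch in Python rather than UBot, just to show the shape of the data. It assumes each fan is an `<a>` inside a `<div class="fsl fwb fcb">`, as in the sample earlier in the thread. `FanParser` and `to_csv` are made-up names, and the sample markup is invented.

```python
import csv
import io
from html.parser import HTMLParser


class FanParser(HTMLParser):
    """Collects (name, url) pairs from <a> tags inside <div class="fsl fwb fcb">."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_div = False
        self._in_link = False
        self._href = ""
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and attrs.get("class") == "fsl fwb fcb":
            self._in_div = True
        elif tag == "a" and self._in_div:
            self._in_link = True
            self._href = attrs.get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._in_link:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self.rows.append(("".join(self._text).strip(), self._href))
            self._in_link = False
        elif tag == "div":
            self._in_div = False


def to_csv(rows):
    """Render rows as CSV text with a header; a spreadsheet can open the result."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["name", "url"])
    writer.writerows(rows)
    return buf.getvalue()


# Invented sample markup for illustration.
sample = (
    '<div class="fsl fwb fcb"><a href="https://www.facebook.com/some.user">Some User</a></div>'
    '<div class="other"><a href="https://example.com/ignored">Not a fan</a></div>'
    '<div class="fsl fwb fcb"><a href="https://www.facebook.com/other.user">Some Other User</a></div>'
)

parser = FanParser()
parser.feed(sample)
csv_text = to_csv(parser.rows)
print(csv_text)
```

In UBot terms, this would presumably mean scraping the `href` attribute alongside `innertext` from the same elements and keeping the two lists in step.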

 

Again, if anyone has any tips or examples for working with this kind of data, I'd greatly appreciate it!


Thank you.

 

Actually, Facebook doesn't allow access to this information via the API, which is why I'm building this bot.

 

I'm onto something, though. Think this is all starting to make sense.

