UBot Underground

Why Am I Not Scraping Href Attribute



Here is the relevant HTML:

<div class="mctitle"><a target="_blank" href="http://twicsy.com/" rel="nofollow">Celebrities That Go Braless</a></div>

The following UBot block gives me a list of 22 blank items (the count of 22 is correct, but every item is empty):

 

$Scrape_attribute

Element to Scrape: <class="mctitle">

Attribute to Scrape: href

 

Any ideas?


set(#links,$scrape attribute(<target="_blank">,"href"),"Global")

 

I've just loaded that link on a blank HTML page. If the code above scrapes more links than you need, try narrowing the selector, e.g. combine target="_blank" with class= and so on.


Thanks for the responses, guys.

 

target="_blank" does grab those links, but along with many others that I don't want. Adding a boolean AND condition with class="mctitle" gives me no results at all.

 

fullhref is giving me links, and I can't even tell where it is pulling them from.

 

Looks like I will just have to use regex. Is there any other way around it?


Can you please share the full code, including the URL of the page where you are trying to scrape <class="mctitle">?

If you can share that, I'll take a look.

Dan

 



Problem solved, please look at my next post.

 

Thanks, Dan for your offer to help

Edited by webtrend

I don't know. You might try scraping the innerhtml, then using a regular expression to grab the text between the start of the href and the closing quotation mark.

Edited by LordFrz
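
For anyone who wants to see what that regex idea looks like outside UBot, here is a rough sketch in plain Python. It is only an illustration, not UBot code: the names INNER_HTML, TWO_LINKS, HREF_RE and hrefs_from_inner_html are made up for this example, and the second link (example.com) is hypothetical. It assumes the href values are always wrapped in double quotes, as in the snippet from the first post.

```python
import re

# Inner HTML of one <div class="mctitle"> block, taken from the first post.
INNER_HTML = ('<a target="_blank" href="http://twicsy.com/" rel="nofollow">'
              'Celebrities That Go Braless</a>')

# Hypothetical block that happens to contain a second link.
TWO_LINKS = INNER_HTML + '<a href="http://example.com/extra">more</a>'

# Capture everything between href=" and the next double quote.
HREF_RE = re.compile(r'href="([^"]*)"')


def hrefs_from_inner_html(inner_html):
    """Return every double-quoted href value found in the fragment."""
    return HREF_RE.findall(inner_html)


print(hrefs_from_inner_html(INNER_HTML))
print(len(hrefs_from_inner_html(TWO_LINKS)))
```

Note that the pattern returns every href in whatever text it is given, so if a scraped block holds more than one link, the resulting list will have more items than the number of blocks.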

^^^ What you are recommending works, but it scrapes a lot more URLs than there are items in my other lists. I want all 3 lists to be consistent, so that each numbered item on a given list corresponds to the same numbered item on the other lists.

 

I would like to stay within that class and scrape the href attribute from it. The HTML document seems to be well structured, so I don't understand why this simple command is not working.


Never mind, I figured out the solution.

 

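You have to use $element child to select the "a" tag inside the div, and then choose "href" as the attribute. The href lives on the <a> tag, not on the <div class="mctitle"> that wraps it, which is why scraping the div itself returned 22 blank items.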

 

like this:

add list to list(%links,$scrape attribute($element child(<class="mctitle">),"href"),"Don\'t Delete","Global")
Edited by webtrend
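
To show why the original scrape came back blank, here is a small sketch in plain Python (standard library only). It is an illustration of the parent/child idea, not UBot code, and the names ChildHrefParser, scrape and HTML are invented for this example. It assumes the page structure matches the snippet from the first post: a div with class "mctitle" wrapping a single "a" tag.

```python
from html.parser import HTMLParser

# The snippet from the first post.
HTML = ('<div class="mctitle"><a target="_blank" href="http://twicsy.com/" '
        'rel="nofollow">Celebrities That Go Braless</a></div>')


class ChildHrefParser(HTMLParser):
    """Records the href on each mctitle div and on the <a> tags inside it."""

    def __init__(self):
        super().__init__()
        self.depth = 0          # > 0 while inside a matching div
        self.div_hrefs = []     # href found on the div itself (None if absent)
        self.child_hrefs = []   # href found on <a> tags inside the div

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self.depth > 0:
                self.depth += 1  # nested div inside a matching div
            elif "mctitle" in (attrs.get("class") or "").split():
                self.depth = 1
                self.div_hrefs.append(attrs.get("href"))
        elif tag == "a" and self.depth > 0 and "href" in attrs:
            self.child_hrefs.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1


def scrape(html):
    """Return (hrefs on the divs, hrefs on their child links)."""
    parser = ChildHrefParser()
    parser.feed(html)
    return parser.div_hrefs, parser.child_hrefs


div_hrefs, child_hrefs = scrape(HTML)
print(div_hrefs)    # one entry per div, but no href value on the div itself
print(child_hrefs)  # the href taken from the child <a> tag
```

The div is matched once, so the count is right, but it carries no href of its own. Only the child link does, which matches what the $element child version is doing.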