webtrend, posted March 26, 2015:
Here is the relevant HTML code:
<div class="mctitle"><a target="_blank" href="http://twicsy.com/" rel="nofollow">Celebrities That Go Braless</a></div>
The following UBot block gives me a list of 22 blank items (the count of 22 is accurate, but every item is empty):
$scrape attribute, Element to Scrape: <class="mctitle">, Attribute to Scrape: href
Any ideas?
deliter, posted March 26, 2015:
set(#links,$scrape attribute(<target="_blank">,"href"),"Global")
I've just loaded that link on a blank HTML page. If the above code scrapes more than the links you need, maybe narrow the selector, for example by combining target="_blank" with class= and so on.
LordFrz, posted March 26, 2015:
Try the attribute 'fullhref'.
webtrend, posted March 26, 2015:
Thanks for the responses, guys. target="_blank" does grab those links, but along with many others that I don't want. Adding a boolean AND condition with class="mctitle" does not give me any results. fullhref gives me links whose source I can't even identify. It looks like I will just have to use regex. Is there any other way around it?
Bot-Factory, posted March 26, 2015:
Can you please share the full code, including the URL of the page where you are trying to scrape <class="mctitle">? If you can share that, I'll take a look.
Dan
webtrend, posted March 27, 2015 (edited March 27, 2015), in reply to Bot-Factory:
Problem solved, please look at my next post. Thanks, Dan, for your offer to help.
LordFrz, posted March 27, 2015 (edited March 27, 2015):
I don't know. You might try scraping the innerhtml, then using a regular expression to capture the text between the start of the href and the closing quotation mark.
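For readers following along outside UBot, the regex idea in this post can be sketched in Python. This is only an illustration and not UBot code: the names inner_html, HREF_PATTERN and extract_hrefs are hypothetical, the sample fragment is the anchor tag from the first post, and the pattern assumes double-quoted href values.

```python
import re

# Sample inner HTML of the <div class="mctitle"> element from the first post.
inner_html = (
    '<a target="_blank" href="http://twicsy.com/" rel="nofollow">'
    "Celebrities That Go Braless</a>"
)

# Capture everything between href=" and the next double quote.
# Assumption: href values are always double-quoted in the scraped markup.
HREF_PATTERN = re.compile(r'href="([^"]*)"')


def extract_hrefs(fragment):
    """Return every double-quoted href value found in an HTML fragment."""
    return HREF_PATTERN.findall(fragment)


links = extract_hrefs(inner_html)
print(links)
```

A pattern like this matches every href in whatever fragment it is given, so it only stays aligned with other lists if it is run on the inner HTML of each matching element rather than on the whole page.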
webtrend, posted March 27, 2015:
LordFrz, what you are recommending works, but it scrapes many more URLs than my other lists contain. I want all three lists to be consistent, so that each numbered item on a given list corresponds to the same numbered item on the other lists. I would like to stay within that class and scrape the href attribute of the class. The HTML document seems to be well structured, so I don't understand why this simple command is not working.
webtrend, posted March 27, 2015 (edited March 27, 2015):
Never mind, I figured out the solution. You have to use the $element child block to read the "a" tag and then choose "href" as the attribute, like this:
add list to list(%links,$scrape attribute($element child(<class="mctitle">),"href"),"Don\'t Delete","Global")
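The likely reason the original attempt returned blanks is that class="mctitle" sits on the div, while href sits on the "a" tag nested inside it. The Python sketch below illustrates that distinction with the standard library's html.parser. It is not UBot code: ChildLinkParser, wrapper_hrefs and child_hrefs are hypothetical names, and the second link in the sample page is an assumed stand-in for the unwanted target="_blank" links mentioned earlier in the thread.

```python
from html.parser import HTMLParser


class ChildLinkParser(HTMLParser):
    """Collect href values from <a> tags nested inside a wrapper <div>."""

    def __init__(self, wrapper_class):
        super().__init__()
        self.wrapper_class = wrapper_class
        self.depth = 0            # > 0 while inside a matching wrapper div
        self.wrapper_hrefs = []   # href read from the wrapper div itself
        self.child_hrefs = []     # href read from <a> children of the wrapper

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self.depth > 0:
                self.depth += 1   # nested div inside the wrapper
            elif self.wrapper_class in (attrs.get("class") or "").split():
                self.depth = 1
                # The div carries no href, so this records an empty string.
                self.wrapper_hrefs.append(attrs.get("href", ""))
        elif tag == "a" and self.depth > 0 and "href" in attrs:
            self.child_hrefs.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1


# First element is the markup from the original post; the second link is an
# assumed example of an unrelated target="_blank" link on the same page.
page = (
    '<div class="mctitle"><a target="_blank" href="http://twicsy.com/" '
    'rel="nofollow">Celebrities That Go Braless</a></div>'
    '<a target="_blank" href="http://example.com/other">Other</a>'
)

parser = ChildLinkParser("mctitle")
parser.feed(page)
print(parser.wrapper_hrefs)
print(parser.child_hrefs)
```

Reading href from the wrapper yields one empty value per matching div, which is consistent with a correctly sized but blank list, while reading it from the child anchor yields the link and skips anchors outside the wrapper.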