itexspert (posted March 5, 2015):
A lot of sites today have a "Contact Us" or "About Us" page, and I was wondering how people usually collect the info on them. There are many sites out there, so if I want to gather phone numbers or e-mail addresses from official pages, how do you find those pages? Is there a regex people use, or some trick you know that could help me? I am trying to extract contact info from every website for my new project. Any help is greatly appreciated!
arunner26 (posted March 5, 2015):
I let Google or Bing do the heavy lifting to find pages that may contain contact emails. Below is a list of search phrases I used in Google or Bing to identify pages that might contain emails for a campaign aimed at SE Florida SEO/advertising companies. I run one search in Google or Bing for each row below, then scrape the first 2 or 3 pages of the SERPs to get URLs:

"Coral Gables" seo and (advertising or marketing) and (contact or "contact us")
Hialeah seo and (advertising or marketing) and (contact or "contact us")
Homestead seo and (advertising or marketing) and (contact or "contact us")
"Key West" seo and (advertising or marketing) and (contact or "contact us")
Miami seo and (advertising or marketing) and (contact or "contact us")
"Miami Beach" seo and (advertising or marketing) and (contact or "contact us")
"Fort Lauderdale" seo and (advertising or marketing) and (contact or "contact us")
"Boca Raton" seo and (advertising or marketing) and (contact or "contact us")
"West Palm Beach" seo and (advertising or marketing) and (contact or "contact us")
"Port St Lucie" seo and (advertising or marketing) and (contact or "contact us")
"Fort Pierce" seo and (advertising or marketing) and (contact or "contact us")
"Vero Beach" seo and (advertising or marketing) and (contact or "contact us")
Kendall seo and (advertising or marketing) and (contact or "contact us")
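The list of search phrases above follows one pattern, so it can be generated rather than typed by hand. The following is only a rough sketch of that idea; the names `QUERY_TAIL` and `build_queries` are made up for illustration and are not from arunner26's setup.

```python
# Sketch: build one search phrase per city, following the pattern in the post.
# QUERY_TAIL and build_queries are hypothetical names chosen for this example.
QUERY_TAIL = 'seo and (advertising or marketing) and (contact or "contact us")'


def build_queries(cities):
    """Return one search phrase per city, quoting multi-word city names."""
    queries = []
    for city in cities:
        # Multi-word names are wrapped in quotes so they are searched as a phrase.
        name = f'"{city}"' if " " in city else city
        queries.append(f"{name} {QUERY_TAIL}")
    return queries


if __name__ == "__main__":
    for query in build_queries(["Coral Gables", "Hialeah", "Key West", "Miami"]):
        print(query)
```

Each generated phrase would then be submitted to the search engine by whatever scraper you already use; that part is not shown here.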
itexspert (posted March 6, 2015):
Hmm, so let's say I have a website name like www.somewebsite.com. Could I make a phrase like this?

www.somewebsite.com and (contact or "contact us")

Do you think this would work?
arunner26 (posted March 6, 2015):
Try this:

(contact or "contact us") site:www.somewebsite.com
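If you have a whole list of sites, the `site:` phrase above can be built per site. This is a minimal sketch under the assumption that your input is a mix of bare hostnames and full URLs; `contact_query` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def contact_query(site):
    """Build the site-restricted search phrase from a hostname or a full URL."""
    # urlparse only fills in the hostname when a "//" prefix is present,
    # so bare names like "www.somewebsite.com" get one prepended.
    if "//" not in site:
        site = "//" + site
    host = urlparse(site).hostname
    if not host:
        raise ValueError(f"could not find a hostname in {site!r}")
    return f'(contact or "contact us") site:{host}'


if __name__ == "__main__":
    print(contact_query("www.somewebsite.com"))
    print(contact_query("http://www.somewebsite.com/some/page"))
```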
Team_LX (posted March 6, 2015):
Quoting arunner26: Try this: (contact or "contact us") site:www.somewebsite.com

Just so you know, this will work, but after so many searches Google will throw up a CAPTCHA.
HelloInsomnia (posted March 6, 2015):
Quoting Team_LX: Just so you know, this will work, but after so many searches Google will throw up a CAPTCHA.

You can use the site: operator in Bing and you won't ever get a CAPTCHA. You can pretty much scrape Bing without limits, although sometimes you do get weird results, so I would try not to use more than 1 request per second per IP. It's still way faster than Google.
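The "1 request per second per IP" suggestion can be enforced with a small throttle that you call before every request. This is just one possible sketch; the `Throttle` class is a hypothetical helper, and the clock and sleep functions are passed in only so the timing can be checked without real waiting.

```python
import time


class Throttle:
    """Block so that consecutive calls to wait() are at least min_interval apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                # Too soon since the last request: sleep off the difference.
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    throttle = Throttle(min_interval=1.0)
    for query in ["first query", "second query", "third query"]:
        throttle.wait()
        # The actual search request would go here.
        print("would send:", query)
```

If you scrape through several IPs, one `Throttle` per IP would match the "per IP" part of the advice.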
Pete (posted March 6, 2015):
RSS is faster and friendlier.

Edit: sorry Insomnia, I should have been a little clearer. Example:

http://www.bing.com/search?format=rss&q=site%3Awestminster.gov.uk+contact+us+%40
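Pete's example URL can be rebuilt for any domain with the standard library, and the feed that comes back can be read with an XML parser instead of scraping HTML. The sketch below assumes Bing still honours `format=rss` as in the example and that the response is ordinary RSS 2.0 with `<item><link>` entries; both are assumptions, and `bing_rss_url` and `links_from_rss` are hypothetical names. No request is sent here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def bing_rss_url(domain, terms="contact us @"):
    """Build a search URL shaped like the example in the post."""
    query = f"site:{domain} {terms}"
    # urlencode escapes ":" and "@" and turns spaces into "+".
    return "http://www.bing.com/search?" + urlencode({"format": "rss", "q": query})


def links_from_rss(xml_text):
    """Return the <link> of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    return [item.findtext("link", default="").strip() for item in root.iter("item")]


if __name__ == "__main__":
    print(bing_rss_url("westminster.gov.uk"))
```

Fetching the URL (with whatever HTTP client you use) and passing the response text to `links_from_rss` would give the candidate contact pages.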
HelloInsomnia (posted March 6, 2015):
Quoting Pete: RSS is faster and friendlier.

Always check for a sitemap.xml as well. I don't know if RSS will always include a contact-us page, but it's a really good point not to miss. Since 20% or more of sites will have a sitemap, and maybe some will include the contact page in the RSS, that is a lot of searches saved.
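Checking a sitemap for contact pages could look roughly like this. The sketch assumes the site serves a standard sitemap (the sitemaps.org 0.9 namespace) and that contact pages have "contact" or "about" somewhere in the URL; the keyword list and the name `contact_urls_from_sitemap` are illustrative guesses. Downloading `sitemap.xml` is left out.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
# Assumed keywords; adjust them to the sites you are targeting.
KEYWORDS = ("contact", "about")


def contact_urls_from_sitemap(xml_text):
    """Return sitemap URLs whose address contains one of KEYWORDS."""
    root = ET.fromstring(xml_text)
    urls = []
    for loc in root.iter(SITEMAP_NS + "loc"):
        url = (loc.text or "").strip()
        if any(keyword in url.lower() for keyword in KEYWORDS):
            urls.append(url)
    return urls


if __name__ == "__main__":
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>http://www.somewebsite.com/</loc></url>"
        "<url><loc>http://www.somewebsite.com/contact-us</loc></url>"
        "</urlset>"
    )
    print(contact_urls_from_sitemap(sample))
```

When a sitemap turns up a contact URL, the search-engine query for that site can be skipped entirely.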
Team_LX (posted March 6, 2015):
Quoting HelloInsomnia: You can use the site: operator in Bing and you won't ever get a CAPTCHA. You can pretty much scrape Bing without limits, although sometimes you do get weird results, so I would try not to use more than 1 request per second per IP. It's still way faster than Google.

Thanks for that, lol. I totally forgot about other search engines.