Bot-Factory 602 Posted May 23, 2014
Hello. I need to scrape a value that is not visible in the page's source code. But when I look in the browser (view generated source), the value is there. It's probably generated via JavaScript, so I could use eval to run those scripts. Has anyone done that before? What's your approach here? Thanks in advance, Dan
gabel 51 Posted May 23, 2014
Can you post a link to that? I would like to take a look.
Bot-Factory 602 Posted May 23, 2014
Sure: https://sellercentral.amazon.com/gp/homepage.html
If you open the page in uBot and look at the generated source code, you will see:
<input name="metadata1" type="hidden" value="VgkbdMX75kh01t+Mt+0RPJOnyE+v/wRz8fC2k+t9fDAsf+wlN9/b634gwhYpVmAKSjo2by3K0UP9wgeKC1Y4NLCoGyrZ0I7o405JB1XM4eMILJi3KC0Y0HQvLVKXpC1FlWnUYEnt8XZuKCNT4xwPI5w6g496ic4lOTvopQIUWsEWQe9zfLkD23sAbICL7P5ZtUcCvBOrDzc6d66TY+il7W93dVbSMaMamcCGRnSIoiRizbQ04jT1YuAWefm762MB7VxVhWTySH9mNFTvjHlS0QRCKTBH0ji+LTrS7wnpejcIN/ujwwViWbVvK1UO2OGwUaKwvKSuX/sn0xGALOvx0ECqErT1Vguujg55upHsvELkqDJic5BYL4gnnaX8fta1jMbMfnU+MPHEFeR+V3Mcxv7J+V+Tc24jVSeR3YyjLpmdUF/jJIpe5z/u99v3DjHYg8qlUpAYTFXpPwD2XnFbNvtFBcTPusmssko50SKNOzq1UoYpSgt2xi9cwvI/HyY7zh4g/bylpbMlX6A+FAfsWdWLTEqHU4gs7omvDv6gNzfbX1LzPkqvH95+srRuc51qxXRvna30p/wbFkMgNrVUAUFWRu/0TYQy82GGyhjP1JYUoB7IaZ9TeNOvNmAZhRRXgaxcaBGtnsif96tO/0w53dgcr9u0bzgtXNLUiemk99rO8n5viLio9fLdun4DaX4JgeYs=">
But this is not in the regular source of the site. And when you sniff the login request in Fiddler you will see that the following parameters are used:
The widget token can be read from the site directly:
<input type="hidden" name="widgetToken" value="X2VuY29kaW5nPVVURjgmb3BlbmlkLmFzc29jX2hhbmRsZT1zY19uYV9hbWF6b24mb3BlbmlkLmNsYWltZWRfaWQ9aHR0cCUzQSUyRiUyRnNwZWNzLm9wZW5pZC5uZXQlMkZhdXRoJTJGMi4wJTJGaWRlbnRpZmllcl9zZWxlY3Qmb3BlbmlkLmlkZW50aXR5PWh0dHAlM0ElMkYlMkZzcGVjcy5vcGVaaaaRFVURjglMjYqVmVyc2lvbiolM0QxJTI2KmVudHJpZXMqJTNEMCZwYWdlSWQ9c2NfbmFfYW1hem9uJnNob3dSbXJNZT0w:ZlBSMlRuMllzdkhmb1ZoRTVvaUoyWUF6Z0VNdVFPSGxKU0tCKzAwaVNMWT06MQ==">
Dan
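The difference Dan describes (widgetToken is in the static source, metadata1 only appears after scripts run) can be checked outside the browser. A minimal sketch in Python, assuming the standard library's html.parser stands in for the plugin's XPath parser; the class and function names and the sample HTML are made up for illustration:

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name -> value for every <input type="hidden"> in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return the hidden form fields found in the given HTML text."""
    parser = HiddenInputCollector()
    parser.feed(html)
    parser.close()
    return parser.fields


# Hypothetical static source: widgetToken is present, metadata1 is not.
sample = '<form><input type="hidden" name="widgetToken" value="abc123=="></form>'
fields = hidden_fields(sample)
print("widgetToken" in fields, "metadata1" in fields)
```

Running the same function on the raw HTTP response and on the browser's generated source would show which fields are added by script.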
gabel 51 Posted May 24, 2014
Hi Dan, sorry, I haven't had too much time to look into this. The JS that generates the metadata1 value can be found here:
<script id="fwicm-script" type="text/javascript" src="https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/login/fwcim._V342128453_.js"></script>
<script type="text/javascript"> fwcim.profile('signinWidget') </script>
I will try to look more when I get some time, but Amazon is known to be a tough one.
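Finding which external script a page loads, as gabel did by hand, can also be done on the fetched HTML. A small sketch, again assuming Python's html.parser; the helper names are hypothetical and the sample markup is the snippet quoted in the post above:

```python
from html.parser import HTMLParser


class ScriptSrcCollector(HTMLParser):
    """Collect the src attribute of every external <script> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def script_sources(html):
    """Return the URLs of external scripts referenced by the HTML text."""
    parser = ScriptSrcCollector()
    parser.feed(html)
    parser.close()
    return parser.sources


page = (
    '<script id="fwicm-script" type="text/javascript" '
    'src="https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/login/fwcim._V342128453_.js"></script>'
    "<script type=\"text/javascript\"> fwcim.profile('signinWidget') </script>"
)
sources = script_sources(page)
print(sources)
```

This only lists the script URLs; it does not run the script or reproduce whatever it computes for metadata1.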
Bot-Factory 602 Posted May 24, 2014
Thanks a lot, Gabel. I will also take another look next week. I have to finish a customer project first. Dan
Bot-Factory 602 Posted May 28, 2014
Update: the metadata1 variable can be found very easily. Just use load html to load the page into the browser, and then extract the metadata1 variable from there. But I'm not sure whether it's the correct one then, because the cookies / sessions are not the same. BUT... it looks like the variable is not necessary at all... :-) There is just one issue left...
plugin command("HTTP post.dll", "http container") {
    plugin command("HTTP post.dll", "http auto redirect", "Yes")
    set(#get, $plugin function("HTTP post.dll", "$http get", "https://sellercentral.amazon.com", "Mozilla/4.0 (Compatible; Windows NT 5.1; MSIE 6.0) (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)", "", "", 20), "Global")
    set(#widgetToken, $plugin function("HTTP post.dll", "$xpath parser index", #get, "//input[@name=\'widgetToken\']", 0, "value"), "Global")
    set(#username, "xx@gmail.com", "Global")
    set(#password, "xx", "Global")
    set(#postdata, "widgetToken={$plugin function("HTTP post.dll", "$url encode", #widgetToken)}&username={$plugin function("HTTP post.dll", "$url encode", #username)}&password={$plugin function("HTTP post.dll", "$url encode", #password)}", "Global")
    set(#post, $plugin function("HTTP post.dll", "$http post", "https://sellercentral.amazon.com/ap/widget", #postdata, "Mozilla/4.0 (Compatible; Windows NT 5.1; MSIE 6.0) (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)", "https://sellercentral.amazon.com/gp/orders-v2/list/ref=im_myo_dnav_storesett_", "", 20), "Global")
    set(#get, $plugin function("HTTP post.dll", "$http get", "https://www.amazon.com", "Mozilla/4.0 (Compatible; Windows NT 5.1; MSIE 6.0) (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)", "", "", 20), "Global")
    load html(#get)
}
I'm logged in to Amazon. I can see that when I load the amazon.com page. But for some strange reason I'm not able to load the sellercentral.amazon.com page. There it still looks like I'm not logged in... Dan
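For readers following along outside uBot, the request body built in the script above can be sketched in Python. This is an illustration under assumptions, not the plugin's implementation: urllib.parse.urlencode stands in for the "$url encode" calls, a shared cookie jar stands in for whatever session handling the plugin does, and all function names and the sample token are hypothetical. Listing which domains received cookies is one way to look into the "logged in on amazon.com but not on sellercentral" observation; it is not a diagnosis.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_login_body(widget_token, username, password):
    """URL-encode the three form fields posted by the script above."""
    return urllib.parse.urlencode(
        {"widgetToken": widget_token, "username": username, "password": password}
    )


def make_session():
    """Return (opener, jar); every request made through opener shares one cookie jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def cookie_domains(jar):
    """List the domains that currently have cookies stored in the jar."""
    return sorted({cookie.domain for cookie in jar})


# Made-up token; '+', '/', '=' and '@' must be percent-encoded in a form body.
body = build_login_body("ab+/c==", "xx@gmail.com", "xx")
print(body)
```

After a login attempt made through the opener, cookie_domains(jar) would show whether any cookies were stored for the sellercentral host at all.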
Bot-Factory 602 Posted May 28, 2014
Now with updated URLs (no redirects):
plugin command("HTTP post.dll", "http container") {
    plugin command("HTTP post.dll", "http auto redirect", "No")
    set(#get, $plugin function("HTTP post.dll", "$http get", "https://sellercentral.amazon.com/gp/homepage.html", "Mozilla/4.0 (Compatible; Windows NT 5.1; MSIE 6.0) (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)", "", "", 20), "Global")
    set(#widgetToken, $plugin function("HTTP post.dll", "$xpath parser index", #get, "//input[@name=\'widgetToken\']", 0, "value"), "Global")
    set(#username, "xxx@gmail.com", "Global")
    set(#password, "xxx", "Global")
    set(#postdata, "widgetToken={$plugin function("HTTP post.dll", "$url encode", #widgetToken)}&username={$plugin function("HTTP post.dll", "$url encode", #username)}&password={$plugin function("HTTP post.dll", "$url encode", #password)}", "Global")
    set(#post, $plugin function("HTTP post.dll", "$http post", "https://sellercentral.amazon.com/ap/widget", #postdata, "Mozilla/4.0 (Compatible; Windows NT 5.1; MSIE 6.0) (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)", "https://sellercentral.amazon.com/gp/homepage.html", "", 20), "Global")
    set(#get, $plugin function("HTTP post.dll", "$http get", "http://www.amazon.com", "Mozilla/4.0 (Compatible; Windows NT 5.1; MSIE 6.0) (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)", "", "", 20), "Global")
    load html(#get)
    wait(2)
    set(#get, $plugin function("HTTP post.dll", "$http get", "https://sellercentral.amazon.com/gp/homepage.html", "Mozilla/4.0 (Compatible; Windows NT 5.1; MSIE 6.0) (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)", "", "", 20), "Global")
    load html(#get)
}
Enter your Amazon credentials and give it a try. You will see that the login works on the www.amazon.com site, but not on sellercentral. Your Amazon account should have access to sellercentral to test this :-) My account works when I try it via the browser. Dan
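Turning automatic redirects off, as in the script above, makes it possible to look at each hop of the login on its own. A rough Python equivalent, assuming urllib in place of the HTTP post plugin; NoRedirect and fetch_one_hop are hypothetical names, and nothing here is specific to Amazon:

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow 3xx responses so each hop can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def fetch_one_hop(opener, url, data=None):
    """Return (status, Location header, list of Set-Cookie headers) for one request."""
    try:
        with opener.open(url, data=data, timeout=20) as resp:
            status, headers = resp.status, resp.headers
    except urllib.error.HTTPError as err:
        # A 3xx (or 4xx/5xx) response still carries a status and headers.
        status, headers = err.code, err.headers
    return status, headers.get("Location"), headers.get_all("Set-Cookie") or []


# Opener that stops at the first response instead of following redirects.
opener = urllib.request.build_opener(NoRedirect)
```

Calling fetch_one_hop on the POST and then on each returned Location value would show where cookies are set along the way, which can be compared with what Fiddler shows for the browser login.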
jamesfar 15 Posted May 29, 2014
Any update on this, Dan?
Bot-Factory 602 Posted May 29, 2014
Quoting jamesfar: "any update on this Dan?"
What updates are you looking for? Dan
jamesfar 15 Posted May 29, 2014
Login to sellercentral via HTTP. I saw your post on Aymen's HTTP thread; you said you had found something and would update this thread.
Bot-Factory 602 Posted May 29, 2014
Quoting jamesfar: "login to sellercentral via http i saw your post on aymen http thread. you said you found something and will update this thread"
Posts 6 and 7 are the updates I mentioned in the HTTP plugin thread. That's all I have so far. Dan
deliter 203 Posted May 31, 2014
Put the script into the load HTML command and change the input from hidden to show, and it should load on the screen. I only realised this myself this morning.
himanshudadhich 0 Posted May 30, 2016
Deliter, I tried what you said, but I only see a blank textbox. Can you give the complete solution for this? Thanks, Himanshu
deliter 203 Posted May 30, 2016
Quoting himanshudadhich: "Deliter, i tried what you said, but i only see the blank textbox. can you give the complete solution for this. Thanks Himanshu"
I was only a beginner back then; it wasn't a correct statement. Post the site and what you are trying to do.
himanshudadhich 0 Posted May 31, 2016
I want to log in to https://sellercentral.amazon.in/gp/homepage.html programmatically. It uses a POST parameter named metadata1, which is set by JavaScript. When I looked into the source code, I found that this value comes from the script below. Can you help me find out how they are setting this metadata1 value?
<script id="fwicm-script" type="text/javascript" src="https://images-na.ssl-images-amazon.com/images/G/02/x-locale/common/login/fwcim._V342129220_.js"></script>
<script type="text/javascript">fwcim.profile('signinWidget')</script>