
How to log in to a website before scraping data from it

I need to scrape some data from a website. The website has a login form, and I have to log in through it with the Node request package before I can scrape anything. I keep trying to log in, but the site doesn't let me in. The code I am using to log in is:

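var request = require('request');

// Share one cookie jar so the session cookie set at login is sent on later requests.
var j = request.jar();
request = request.defaults({ jar: j });

// `url` is the login form's POST target; 'sub_URL' stands for the page to scrape.
request({
  uri: url,
  method: "POST",
  form: { UserName: "myuser1", Password: "pass1" }
}, function (error, response, html) {
  // On a network error `response` is undefined, so check `error` first.
  if (error) {
    console.log("Login request failed: " + error.message);
    return;
  }
  console.log("Status is : " + response.statusCode);
  // A 302 redirect after the POST is treated as a successful login.
  if (response.statusCode == 302) {
    console.log(html);
    request.get('sub_URL', function (error, response, html) {
      if (error) {
        console.log("Inner request failed: " + error.message);
        return;
      }
      console.log("Inner URL status is : " + response.statusCode);
      if (response.statusCode == 200) {
        console.log(html);
      }
    });
  } else {
    console.log("Login did not redirect; it probably failed");
  }
});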

I have looked at several websites and GitHub references, but none of them worked. Please help!

Thanks in advance!

If you only need to run this on a desktop machine, I'd suggest nightmare.js or phantom.js, which let you programmatically open the login modal and type into its inputs. The problems you might be facing are:

  1. The login screen is an iframe. Frames make scraping more complicated, but different frameworks take care of this by switching to the child frame, with something like switchToFrame('child_frame').
  2. The website requires you to enter a confirmation code shown in a watermark, and you have no way of reading that programmatically. In this case you will have to use cookie injection. This is a bit of a grey area; if it's OK to do (I personally don't know how to determine that), get a Chrome extension to view the cookies, export them to a JSON file, then use the appropriate method from your framework to inject them all.
  3. The website requires a confirmation code from Google Auth or something of the sort. I've never dealt with those, so I don't know.
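
The cookie injection from point 2 could look roughly like this. It is only a sketch, assuming the extension exports a JSON array of objects with name, value and domain fields (the exact format depends on the extension you pick). buildCookieHeader is a hypothetical helper written for this example, not part of any library, and the cookie values are made up.

```javascript
// Hypothetical helper: turn cookies exported from a browser extension into a
// Cookie request header. Assumes each entry has `name`, `value` and `domain`.
function buildCookieHeader(exportedCookies, hostname) {
  return exportedCookies
    .filter(function (c) {
      // A leading dot on the domain means the cookie also covers subdomains.
      var domain = String(c.domain || '').replace(/^\./, '');
      return hostname === domain || hostname.endsWith('.' + domain);
    })
    .map(function (c) { return c.name + '=' + c.value; })
    .join('; ');
}

// Example export with made-up values; in practice you would load the file with
// JSON.parse(fs.readFileSync('cookies.json', 'utf8')).
var exported = [
  { name: 'sessionid', value: 'abc123', domain: '.example.com' },
  { name: 'theme', value: 'dark', domain: 'www.example.com' },
  { name: 'other', value: 'x', domain: 'tracker.test' }
];

var cookieHeader = buildCookieHeader(exported, 'www.example.com');
console.log(cookieHeader); // prints "sessionid=abc123; theme=dark"
```

The resulting string can then be passed to the request package from the question through its headers option, for example request({ uri: 'sub_URL', headers: { Cookie: cookieHeader } }, callback), so the server sees the session you already opened in the browser. The session will stop working once those cookies expire, at which point you have to export them again.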


 