Login to web page using HttpWebRequest

I am trying to log in to the website below using HttpWebRequest. I am passing in the username and password using the Credentials property, but I keep getting back the login page of the website. Can anyone explain what I am doing wrong?

https://oyster.tfl.gov.uk/oyster/entry.do (Login Page)

HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(Url);
request.ContentType = "application/x-www-form-urlencoded"; 
request.Credentials = new NetworkCredential(Username, Password);
request.Method = "POST";
request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
request.Headers.Add("Accept-Language: en-us,en;q=0.5");
request.Headers.Add("Accept-Encoding: gzip,deflate");
request.Headers.Add("Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7");
request.KeepAlive = true;
request.Headers.Add("Keep-Alive: 300");
request.Referer = Url;
request.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705;)";

HttpWebResponse response = (HttpWebResponse)request.GetResponse();

using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
    string tmp = reader.ReadToEnd();
}

Note the Terms and Conditions:

The following are prohibited [...snip...]

Use of any automated system, software or process to extract content and/or data, including trawling, data mining and screen scraping.

The Credentials property is for HTTP-level authentication (Basic, Digest, and so on), not for forms-based authentication.

It would be better to use an API if one exists. HTML forms are meant for humans, not computers. It looks like there is a beta TfL API here.

That page has no HTTP authentication (Basic, Digest, NTLM) on it, so Credentials will do nothing.

You need to construct a POST to /oyster/security_check that sends the username and password as the request content. The data to send looks the same as what you would see in the query string if the form were a GET, e.g. username=myName&password=myPass. Keep the cookie returned by this request and send it with subsequent requests.
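A minimal sketch of that approach follows. The endpoint path comes from the answer above. The field names `username` and `password` are taken from the answer's example and may not match the actual input names on the form, so check the page's HTML first. The names `FormLogin`, `BuildFormBody` and `PostForm` are hypothetical helpers introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;

public static class FormLogin
{
    // Builds an application/x-www-form-urlencoded body from name/value pairs,
    // percent-encoding each name and value.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var sb = new StringBuilder();
        foreach (var field in fields)
        {
            if (sb.Length > 0) sb.Append('&');
            sb.Append(Uri.EscapeDataString(field.Key))
              .Append('=')
              .Append(Uri.EscapeDataString(field.Value));
        }
        return sb.ToString();
    }

    // POSTs the body to the given URL. Cookies set by the server are stored in
    // the supplied CookieContainer so later requests can reuse the session.
    public static string PostForm(string url, string body, CookieContainer cookies)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.CookieContainer = cookies;

        byte[] data = Encoding.UTF8.GetBytes(body);
        request.ContentLength = data.Length;
        using (Stream stream = request.GetRequestStream())
        {
            stream.Write(data, 0, data.Length);
        }

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

To use it, create one CookieContainer, build the body from the two fields, and call PostForm with the full URL of /oyster/security_check. Assign the same CookieContainer to the CookieContainer property of every later HttpWebRequest so the session cookie is sent back. Whether the site accepts such a login at all depends on its form and on any extra hidden fields it expects.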

Here is a C# class that you might find quite useful.

It's quite simple to use and has basic functions for downloading a string or byte array. It also scans the login page form for things like authentication tokens that some websites use to prevent programmatic authentication. I have tried it with a number of websites such as Facebook and it seems to work just fine.

[Link Removed]
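The linked class is not reproduced here. As a rough illustration of the token-scanning idea it describes, the sketch below collects the hidden input fields from a login page so they can be posted back along with the username and password. It is not the linked author's code. `HiddenFields` and `Extract` are hypothetical names, and regular expressions are a fragile way to read HTML, so treat this as a starting point under the assumption that the page uses simple, quoted attributes.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class HiddenFields
{
    // Matches each <input ...> tag; attributes are then read from the tag text.
    private static readonly Regex InputTag =
        new Regex(@"<input\b[^>]*>", RegexOptions.IgnoreCase);

    // Returns the quoted value of the named attribute, or null if it is absent.
    private static string Attr(string tag, string name)
    {
        Match m = Regex.Match(
            tag,
            @"(?<![\w-])" + name + @"\s*=\s*(?:""([^""]*)""|'([^']*)')",
            RegexOptions.IgnoreCase);
        if (!m.Success) return null;
        string raw = m.Groups[1].Success ? m.Groups[1].Value : m.Groups[2].Value;
        return WebUtility.HtmlDecode(raw);
    }

    // Collects name/value pairs of all <input type="hidden"> fields in the HTML.
    public static Dictionary<string, string> Extract(string html)
    {
        var fields = new Dictionary<string, string>();
        foreach (Match m in InputTag.Matches(html))
        {
            string tag = m.Value;
            string type = Attr(tag, "type");
            string name = Attr(tag, "name");
            if (name != null &&
                string.Equals(type, "hidden", StringComparison.OrdinalIgnoreCase))
            {
                fields[name] = Attr(tag, "value") ?? "";
            }
        }
        return fields;
    }
}
```

The returned pairs would be added to the form data alongside the credentials before the login POST is sent. A real HTML parser is a safer choice when the markup is irregular.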
