Accessing a web page with basic auth in Python

I'm trying to connect to a web page with mechanize, but I'm getting an HTTP 401 error.

Here's my code:

import base64, mechanize

url = "http://www.dogus.edu.tr/dusor/FrmMain.aspx"
user = "user"
pwd = "pwd"

br = mechanize.Browser()
br.set_handle_robots(False)
br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]

br.add_password(url, user, pwd)
#br.addheaders.append(('Authorization', 'Basic %s' % base64.encodestring('%s:%s' % (user, pwd))))
print br.open(url).read()

Neither add_password nor addheaders works. Is it because I never specified a realm? How can I find out which realm that web page uses? The username and password I'm using are correct, as I can log in through Chrome with those credentials.

The site you are using as a sample page requires NTLM authentication, not basic authentication. You can see this by looking at the returned header fields. For example, curl -I http://www.dogus.edu.tr/dusor/FrmMain.aspx returns:

HTTP/1.1 401 Unauthorized
Content-Length: 1293
Content-Type: text/html
Server: Microsoft-IIS/7.0
WWW-Authenticate: Negotiate
WWW-Authenticate: NTLM
X-Powered-By: ASP.NET
Date: Mon, 07 Apr 2014 21:24:09 GMT

The line WWW-Authenticate: NTLM tells you which authentication method is used. No Basic challenge is offered, which is why add_password and a hand-built Authorization: Basic header are both rejected. I think the answers to the question "Use python mechanize to log into pages with NTLM authentication" will help you.
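The same check can be done from Python instead of curl. The following is only a minimal sketch, written for Python 3 with the standard library rather than the Python 2 and mechanize setup from the question. The names parse_challenges and fetch_challenges are made up for this illustration, and the parser assumes one challenge per WWW-Authenticate line, as in the response above. The Basic realm="Example" value is a hypothetical challenge added for comparison; it does not come from the site.

```python
import re
import urllib.error
import urllib.request


def parse_challenges(header_values):
    """Map each offered scheme (lower-cased) to its realm, or None if absent.

    Assumes one challenge per header line, as in the curl output above.
    """
    challenges = {}
    for value in header_values:
        parts = value.strip().split(None, 1)
        if not parts:
            continue
        scheme = parts[0].lower()
        params = parts[1] if len(parts) > 1 else ""
        match = re.search(r'realm="([^"]*)"', params, re.IGNORECASE)
        challenges[scheme] = match.group(1) if match else None
    return challenges


def fetch_challenges(url):
    """Send a HEAD request and return the parsed challenges of the reply."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request) as response:
            headers = response.headers
    except urllib.error.HTTPError as error:
        headers = error.headers  # urllib raises HTTPError for a 401 reply
    return parse_challenges(headers.get_all("WWW-Authenticate") or [])


# Header values copied from the curl output above.
print(parse_challenges(["Negotiate", "NTLM"]))
# A hypothetical Basic challenge, for comparison.
print(parse_challenges(['Basic realm="Example"']))
```

If the result has no "basic" key, as with the headers shown above, there is no Basic realm to look up, and the request has to go through an NTLM-capable handler instead.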
