
Python: urllib2.HTTPError: HTTP Error 401: authenticationrequired

I'm trying to fetch a web page, but I ran into this problem. I looked up some references, and this is what I have so far:

import sys
import urllib2
from bs4 import BeautifulSoup

user = 'myuserID'
password = "mypassword"

ip = sys.argv[1]
url = "http://www.websites.com/" + ip

passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, url, user, password)
handler = urllib2.HTTPBasicAuthHandler(passman)
opener = urllib2.build_opener(handler)
urllib2.install_opener(opener)

header = {
    'Connection' : 'keep-alive',
    'User-Agent' : 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0',
    'Accept-Language' : 'en-US,en;q=0.5',
    'Accept-Encoding' : 'gzip, deflate'
    }
html = urllib2.urlopen(urllib2.Request(url, None, header))
soup = BeautifulSoup(html, 'html.parser')
# some if else function afterwards #

When I try to run the script, it shows this error:

python checker.py 8.8.8.8
Traceback (most recent call last):
  File "checker.py", line 34, in <module>
    html = urllib2.urlopen(urllib2.Request(url, None, header))
  File "C:\Python27\lib\urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 437, in open
    response = meth(req, response)
  File "C:\Python27\lib\urllib2.py", line 550, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python27\lib\urllib2.py", line 469, in error
    result = self._call_chain(*args)
  File "C:\Python27\lib\urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 656, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "C:\Python27\lib\urllib2.py", line 437, in open
    response = meth(req, response)
  File "C:\Python27\lib\urllib2.py", line 550, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python27\lib\urllib2.py", line 475, in error
    return self._call_chain(*args)
  File "C:\Python27\lib\urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 558, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 401: authenticationrequired

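However, if I first open this page (or any other web page) in a browser and enter my credentials manually, the script works fine afterwards. Am I missing something?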

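Just to add: my current network sits behind a McAfee Web Gateway appliance, so we sometimes have to enter credentials before we can continue browsing. Our user/password is integrated with Active Directory. Could this be causing the problem?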
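To see who is actually asking for credentials, it can help to catch the error and look at its status code and challenge header. The following is only a diagnostic sketch, written for Python 3, where `urllib2.HTTPError` lives in `urllib.error`. The helper name `describe_auth_error` and the sample error built at the bottom are hypothetical and used purely for illustration. A 401 carries a `WWW-Authenticate` header from the server, while a 407 carries a `Proxy-Authenticate` header from a proxy such as a web gateway.

```python
import io
from email.message import Message
from urllib.error import HTTPError  # Python 3 home of urllib2.HTTPError


def describe_auth_error(err):
    """Summarise which kind of authentication an HTTPError is asking for.

    Hypothetical helper for illustration; not part of urllib.
    """
    if err.code == 401:
        # The target server (or something answering on its behalf) wants credentials.
        challenge = err.headers.get("WWW-Authenticate")
        source = "server"
    elif err.code == 407:
        # A proxy in between wants credentials.
        challenge = err.headers.get("Proxy-Authenticate")
        source = "proxy"
    else:
        return "HTTP %d: not an authentication error" % err.code
    # The first token of the challenge names the scheme (Basic, NTLM, Negotiate, ...).
    scheme = challenge.split()[0] if challenge else "unknown"
    return "HTTP %d from %s, final URL %s, scheme %s" % (
        err.code, source, err.geturl(), scheme)


# Build a fake error locally so the helper can be tried without a network.
sample_headers = Message()
sample_headers["WWW-Authenticate"] = 'Basic realm="example"'
sample_error = HTTPError("http://www.websites.com/8.8.8.8", 401,
                         "authenticationrequired", sample_headers,
                         io.BytesIO(b""))
print(describe_auth_error(sample_error))
```

In a real script the helper would be called from an `except HTTPError as err:` block around `urlopen`. If the reported URL differs from the one requested, the 302 in the traceback has sent the request somewhere the registered password may not apply to.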

This seems to work really well (taken from another thread):

import urllib2
import base64
import sys

user = 'myuserID'
password = "mypassword"
ip = sys.argv[1]
url = "http://www.websites.com/" + ip
request = urllib2.Request(url)
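# Build the Basic credentials by hand and send them with the very first request.
# b64encode, unlike encodestring, does not append a newline that has to be stripped.
base64string = base64.b64encode('%s:%s' % (user, password))
request.add_header("Authorization", "Basic %s" % base64string)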
result = urllib2.urlopen(request)
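The likely reason this helps is that `HTTPBasicAuthHandler` only sends credentials after it receives a 401 with a matching Basic challenge, whereas a hand-built `Authorization` header goes out with the first request. Below is a minimal sketch of the same idea for Python 3, where `urllib2` became `urllib.request`. The function names `basic_auth_header` and `build_authenticated_request` are made up for this example, and the credentials are the sample pair from RFC 7617, not real ones.

```python
import base64
from urllib.request import Request  # Python 3 counterpart of urllib2.Request


def basic_auth_header(user, password):
    """Return the value of a Basic Authorization header for user/password."""
    # b64encode works on bytes in Python 3, so encode first and decode after.
    token = ("%s:%s" % (user, password)).encode("utf-8")
    return "Basic " + base64.b64encode(token).decode("ascii")


def build_authenticated_request(url, user, password):
    """Create a Request that carries the credentials up front."""
    request = Request(url)
    request.add_header("Authorization", basic_auth_header(user, password))
    return request


# Sample credentials from RFC 7617; nothing is sent over the network here.
request = build_authenticated_request(
    "http://www.websites.com/8.8.8.8", "Aladdin", "open sesame")
print(request.get_header("Authorization"))
```

Passing the resulting object to `urllib.request.urlopen` would then perform the request. Note that Basic credentials are only base64-encoded, not encrypted, so over plain `http://` they are readable by anything on the path.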

Or you can use requests:

import sys

import requests
from requests.auth import HTTPBasicAuth

user = 'myuserID'
password = "mypassword"
ip = sys.argv[1]
url = "http://www.websites.com/" + ip
# requests sends the Basic credentials with the request itself
res = requests.get(url, auth=HTTPBasicAuth(user, password))
print res.text
