urlretrieve and User-Agent? - Python

Question

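I am using urlretrieve from the urllib module.

I can't seem to find how to add a User-Agent description to my requests.

Is this possible with urlretrieve, or do I need to use another method?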

Answer 1

First, set the version:

import urllib  # Python 2 module layout

urllib.URLopener.version = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36 SE 2.X MetaSr 1.0'

Then:

filename, headers = urllib.urlretrieve(url)

Answer 2

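You can use the URLopener or FancyURLopener classes. The 'version' attribute specifies the User-Agent of the opener object.

from urllib import FancyURLopener  # Python 2; in Python 3 the class is in urllib.request (deprecated since 3.3)

opener = FancyURLopener({})  # empty dict: use no proxies
opener.version = 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.69 Safari/537.36'
opener.retrieve('http://example.com', 'index.html')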

Answer 3

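I know this question is 7 years old. I came across it while trying to figure out how to change the User-Agent when using the urlretrieve function.

For anyone who lands here without having had any luck solving this, here is how I did it: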

    from urllib.request import ProxyHandler, build_opener, install_opener, urlretrieve

    # proxy = ProxyHandler({'http': 'http://192.168.1.31:8888'})
    proxy = ProxyHandler({})  # empty dict: no proxy
    opener = build_opener(proxy)
    opener.addheaders = [('User-Agent','Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4) AppleWebKit/603.1.30 (KHTML, like Gecko) Version/10.1 Safari/603.1.30')]
    # Install the opener globally so that urlretrieve uses it
    install_opener(opener)

    result = urlretrieve(url=file_url, filename=file_name)

I added the proxy so that I could monitor the traffic in Charles. Here is the traffic I got:
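If a proxy such as Charles is not at hand, one way to check that the header is really sent is to point urlretrieve at a throwaway local server. The following is only a sketch for Python 3: the names fetch_with_agent, EchoHandler and seen_agents are made up for this illustration, and the server handles a single request on 127.0.0.1.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener, install_opener, urlretrieve

seen_agents = []  # User-Agent values received by the local test server


class EchoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Record the header the client sent, then return a tiny body.
        seen_agents.append(self.headers.get("User-Agent"))
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet


def fetch_with_agent(agent, filename):
    """Serve one request locally and download it with the given User-Agent."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: pick a free port
    server.timeout = 5  # do not wait forever if the download never arrives
    thread = threading.Thread(target=server.handle_request)
    thread.start()
    try:
        opener = build_opener(ProxyHandler({}))  # no proxy, as in the answer
        opener.addheaders = [("User-Agent", agent)]
        install_opener(opener)  # urlretrieve goes through the installed opener
        url = "http://127.0.0.1:%d/" % server.server_port
        urlretrieve(url, filename)
    finally:
        thread.join()
        server.server_close()
    return seen_agents[-1]
```

Calling fetch_with_agent('MyAgent/1.0', 'out.txt') should return the same string the opener was configured with, which confirms that the installed opener's addheaders replace the default Python-urllib agent. Note that install_opener changes process-wide state, so every later urlopen or urlretrieve call in the same process sends this header too.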

Answer 4

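I don't think this is possible with urlretrieve, or at least not easily. I suggest creating a urllib2.Request object and passing the required headers to it. See

http://docs.python.org/library/urllib2.html#urllib2.urlopen

for some examples.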
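In Python 3, urllib2 was merged into urllib.request, so the Request-based approach suggested here might look roughly like the sketch below. The helper name download and its default User-Agent string are assumptions made for this example, not part of any library.

```python
import shutil
from urllib.request import Request, urlopen


def download(url, filename, user_agent="Mozilla/5.0"):
    """Save url to filename, sending a custom User-Agent header."""
    request = Request(url, headers={"User-Agent": user_agent})
    # Stream the response body to disk instead of loading it all into memory.
    with urlopen(request) as response, open(filename, "wb") as out:
        shutil.copyfileobj(response, out)
    return filename
```

A call such as download('http://example.com', 'index.html') then behaves much like urlretrieve, but the header applies only to this one request and no global opener is modified.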

Answer 5

Something like this, which doesn't use urllib though, worked for me in a scraper:

import requests

imageURL='http://image.jpg'
headers={'user-agent': 'Mozilla/5.0'}
r=requests.get(imageURL, headers=headers)
with open('image.jpg', 'wb') as f:
    f.write(r.content)

5 answers

Answer 1
8 2015-08-22 10:37:48

Answer 2
5 2011-08-14 19:41:03

Answer 3
4 2017-04-19 16:39:03

Answer 4
2 2010-03-02 16:14:12

Answer 5
0 2021-02-11 11:48:05
