urllib2 and cookielib thread safety

Question

As far as I've been able to tell, cookielib isn't thread safe; but then again, the post stating so is five years old, so it might be wrong.

Nevertheless, I've been wondering: if I instantiate a class like this

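import cookielib
import urllib2

class Acc:
    # Note: jar, cookie, opener and headers are class attributes,
    # so every Acc instance shares the same objects.
    jar = cookielib.CookieJar()
    cookie = urllib2.HTTPCookieProcessor(jar)
    opener = urllib2.build_opener(cookie)

    headers = {}

    def __init__(self, login, password):
        self.user = login
        self.password = password

    def login(self):
        return False  # Some magic, irrelevant

    def fetch(self, url):
        # Fetch the URL through the cookie-aware opener and return the body.
        req = urllib2.Request(url, None, self.headers)
        res = self.opener.open(req)
        return res.read()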

for each worker thread, would it work (or is there a better approach)? Each thread would use its own account, so the fact that workers wouldn't share their cookies is not a problem.
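One detail in the class as posted: jar, cookie, opener and headers are defined at class level, so all Acc instances would share one cookie jar and one opener. If the goal is one independent account per thread, a minimal sketch is to create them in __init__ instead. The import fallback is an addition that assumes the Python 3 names http.cookiejar and urllib.request when cookielib and urllib2 are unavailable; the login method is omitted here.

```python
try:
    import cookielib
    import urllib2
except ImportError:
    # Python 3 renamed these modules; aliased here so the sketch runs on both.
    import http.cookiejar as cookielib
    import urllib.request as urllib2


class Acc(object):
    def __init__(self, login, password):
        self.user = login
        self.password = password
        self.headers = {}
        # Per-instance jar and opener: nothing is shared between accounts.
        self.jar = cookielib.CookieJar()
        self.opener = urllib2.build_opener(
            urllib2.HTTPCookieProcessor(self.jar))

    def fetch(self, url):
        # Cookies set by the server end up in this instance's jar only.
        req = urllib2.Request(url, None, self.headers)
        return self.opener.open(req).read()
```

With this layout each worker thread can hold its own Acc object and never touch another thread's cookies.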

Answer 1

You can read the library's implementation in [python_install_path]/lib/cookielib.py to confirm that cookielib.CookieJar is thread safe.

This means that if you share one CookieJar instance between several connections in different threads, you will not even see inconsistent reads of the cookie set, because CookieJar guards its internal state with the lock self._cookies_lock.
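As a rough illustration of that claim, the sketch below has several threads add cookies to one shared CookieJar and then counts them. The make_cookie helper, the example.com domain and the thread and cookie counts are made up for the example; the import fallback assumes the Python 3 name http.cookiejar when cookielib is unavailable.

```python
import threading

try:
    import cookielib
except ImportError:
    import http.cookiejar as cookielib  # Python 3 name for the same module


def make_cookie(name, value):
    # Hypothetical helper: builds a simple session cookie for example.com.
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain="example.com", domain_specified=True,
        domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


jar = cookielib.CookieJar()  # one jar shared by every thread


def worker(n):
    # set_cookie acquires the jar's internal lock on each call.
    for i in range(100):
        jar.set_cookie(make_cookie("t%d_c%d" % (n, i), "x"))


threads = [threading.Thread(target=worker, args=(n,)) for n in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

cookie_count = len(jar)  # number of cookies stored after all threads finish
```

This only exercises the jar itself; it says nothing about whether the rest of urllib2 is safe to share between threads.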

Answer 2

You want to use pycurl (the Python interface to libcurl). It's thread-safe and supports cookies, HTTPS, etc. The interface is a bit strange, but it just takes a bit of getting used to.

I've only used pycurl with HTTPBasicAuth + SSL, but I did find an example using pycurl and cookies here. I believe you'll need to update the pycurl.COOKIEFILE (line 74) and pycurl.COOKIEJAR (line 82) to have some unique name (maybe keying off of id(self.crl)).

As I remember, you'll need to create a new pycurl.Curl() for each request to maintain thread safety.
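A rough sketch of what this could look like, not the linked example itself: each request gets a fresh pycurl.Curl() handle, and the cookie file name is keyed off the current thread rather than id(self.crl). The helper name cookie_file_for_current_thread and the file naming scheme are assumptions made for illustration; pycurl is a third-party package and is imported lazily so the helper can be used without it.

```python
import os
import tempfile
import threading


def cookie_file_for_current_thread(directory=None):
    # Hypothetical helper: one cookie file path per live thread.
    # Thread idents can be reused after a thread exits.
    directory = directory or tempfile.gettempdir()
    ident = threading.current_thread().ident
    return os.path.join(directory, "cookies_%d.txt" % ident)


def fetch(url):
    import pycurl  # third-party; must be installed separately
    from io import BytesIO

    buf = BytesIO()
    cookie_path = cookie_file_for_current_thread()
    c = pycurl.Curl()  # new handle for every request
    try:
        c.setopt(pycurl.URL, url)
        c.setopt(pycurl.COOKIEFILE, cookie_path)  # cookies are read from here
        c.setopt(pycurl.COOKIEJAR, cookie_path)   # and written back on close
        c.setopt(pycurl.WRITEFUNCTION, buf.write)
        c.perform()
    finally:
        c.close()
    return buf.getvalue()
```

Keeping the cookie file per thread means two workers logged in to different accounts never overwrite each other's session.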

Answer 3

I have the same question as you. If you do not use pycurl, I think you must call urllib2.install_opener(self.opener) before each urllib2.urlopen.

Maybe I should use pycurl too; urllib2 is not so smart.

3 answers

Answer 1: score 2, 2013-01-27 11:55:15
Answer 2: score 2, accepted, 2010-07-01 01:35:44
Answer 3: score 1, 2011-02-10 06:08:11
