Python3, urllib.request, urlopen() timeout

Question

I'm using urlopen() to open a website and pull (financial) data from it. Here is the line:

sourceCode = urlopen('xxxxxxxx').read()

After this, I pull out the data I need. I loop through different pages on the same domain to pull data (stock info). I end the body of the loop with:

time.sleep(1)

as I'm told that keeps the site from blocking me. My program will run for a few minutes, but at some point it stalls and stops pulling data. I can rerun it and it'll run for another arbitrary amount of time and then stall again.

Is there something I can do to prevent this?

Answer 1

This worked (for most websites) for me:

If you're using the urllib.request library, you can create a Request and spoof the user agent. This might mean that they stop blocking you.

from urllib.request import Request, urlopen

# path is the URL of the page to fetch
req = Request(path, headers={'User-Agent': 'Mozilla/5.0'})
data = urlopen(req).read()
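
Since the title asks about timeouts: urlopen() has no timeout unless you pass one (or set a global socket default), so a connection that goes silent can block forever, which would look like the stall you describe. Below is a rough sketch of combining the user agent with a timeout and a few retries. The names fetch, retries, delay and opener are made up for this example, and the default values are guesses rather than recommendations.

```python
import socket
import time
from urllib.error import URLError
from urllib.request import Request, urlopen


def fetch(url, timeout=10, retries=3, delay=1, opener=urlopen):
    """Return the body of url as bytes, or None if every attempt fails.

    fetch, retries, delay and opener are hypothetical names for this sketch;
    opener exists only so the function can be tried without a network.
    """
    req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    for attempt in range(retries):
        try:
            # timeout (in seconds) makes a silent connection raise an
            # exception instead of blocking indefinitely
            with opener(req, timeout=timeout) as response:
                return response.read()
        except (URLError, socket.timeout):
            # pause before the next attempt, but not after the last one
            if attempt < retries - 1:
                time.sleep(delay)
    return None
```

In your loop you would call sourceCode = fetch('xxxxxxxx') and skip the page when it returns None, keeping the time.sleep(1) between pages.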

Hope this helps.

Answer 1: score -1, posted 2019-03-05 21:26:31
