
Retrieving files from URL in Python returns blank

I'm working on a chess-related project for which I have to download a very large number of files from ChessTempo.

When running the following code:

import urllib.request

url = "http://chesstempo.com/requests/download_game_pgn.php?gameids="

for i in range (3,500):
    urllib.request.urlretrieve(url + str(i),'Games/Game ' + str(i) + ".pgn")
    print("Downloaded file nº " + str(i))

I get the expected list of roughly 500 files, but they are all blank except the second and third files, which contain the correct data.

When I open the URLs by hand, it all works perfectly. What am I missing?

In fact, I can only download files 2 and 3; all the others are empty...

Were you logged in while accessing those files "manually" (which I assume means using a web browser)?

If so, note that an HTTP request does not consist only of the URL; a lot of other information is transferred along with it. So if you are not getting the same response, you are almost certainly not making the same request.
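To see how little a plain urllib call sends compared with a browser, here is a small sketch that lists the headers a freshly built opener adds to every request. The helper name `default_urllib_headers` is made up for illustration. `urlretrieve` goes through the same default opener, so unless you add headers yourself, this should be all the server sees besides the URL.

```python
import urllib.request


def default_urllib_headers():
    """Return the headers a default urllib opener attaches to each request."""
    # build_opener() creates the same kind of opener that urlopen() and
    # urlretrieve() use when no custom opener has been installed.
    opener = urllib.request.build_opener()
    return dict(opener.addheaders)


if __name__ == "__main__":
    # Typically just a User-agent entry such as "Python-urllib/3.x";
    # note that no cookies are sent.
    print(default_urllib_headers())
```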

In Chrome you can see the requests that are made within a page.

From Developer Tools, go to Network > select a name from the list > Request Headers (see picture).

What you are most likely looking for is the cookies.
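If cookies do turn out to be the difference, one way to test that is to copy the `Cookie` value shown under Request Headers in the browser and send it from Python. This is only a rough sketch: the helpers `build_request` and `download_game` are hypothetical names, the cookie string is a placeholder you would replace with your own, and whether the site also checks other headers (the `User-Agent` here is a guess) is something I can't confirm.

```python
import os
import urllib.request

BASE_URL = "http://chesstempo.com/requests/download_game_pgn.php?gameids="


def build_request(game_id, cookie_header):
    """Build a GET request that carries the cookies copied from the browser."""
    return urllib.request.Request(
        BASE_URL + str(game_id),
        headers={
            "Cookie": cookie_header,      # value copied from Request Headers
            "User-Agent": "Mozilla/5.0",  # assumption: mimic a browser
        },
    )


def download_game(game_id, cookie_header, out_dir="Games"):
    """Download one game and return the number of bytes written."""
    os.makedirs(out_dir, exist_ok=True)
    with urllib.request.urlopen(build_request(game_id, cookie_header)) as resp:
        data = resp.read()
    path = os.path.join(out_dir, "Game " + str(game_id) + ".pgn")
    with open(path, "wb") as f:
        f.write(data)
    return len(data)


if __name__ == "__main__":
    # Placeholder cookie string; paste the real one from your browser session.
    my_cookies = "name=value"
    for i in range(3, 500):
        size = download_game(i, my_cookies)
        print("Downloaded file", i, "-", size, "bytes")
```

Printing the byte count makes it easy to spot which downloads still come back empty.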

Hope it helps.
