Downloading a file with a URL using Python
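I want to download the file at the URL below using Python. I tried the following code, but it does not seem to work. I think the error is in the file format. I would be glad if you could suggest changes to the code, or new code I could use for this purpose.
Link to the website
https://www.gov.uk/government/statistics/transport-use-during-the-coronavirus-covid-19-pandemic
URL to download
My code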
from urllib import request
response = request.urlopen("https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/959864/COVID-19-transport-use-statistics.ods")
csv = response.read()
csvstr = str(csv).strip("b'")
lines = csvstr.split("\\n")
f = open("historical.csv", "w")
for line in lines:
    f.write(line + "\n")
f.close()
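Basically, I just want to download the file. I have heard that Beautifulsoup can be used for this, but I do not have much experience with it. Any code that serves my purpose would be much appreciated.
Thanks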
I see that you just want to download the file in .ods format. I don't think saving it with a .csv extension will convert it to a CSV file.
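To see why the code in the question produces a broken file, here is a small sketch. The byte string `raw` is a made-up stand-in for the start of a downloaded .ods file (an .ods file is a ZIP archive, so it begins with the bytes PK); it is not the real file content. The point is only that `str()` on bytes returns their printable representation, not the data itself.

```python
# Hypothetical stand-in for the first bytes of a downloaded .ods file:
# the ZIP signature followed by a few arbitrary binary bytes.
raw = b"PK\x03\x04\x14\x00\n\xff"

# What the question's code does: take the textual repr of the bytes
# and trim the leading "b'" and trailing "'" characters.
as_text = str(raw).strip("b'")

# Writing that text to disk stores escape sequences such as "\x03"
# as literal characters instead of the original single bytes.
mangled = as_text.encode("utf-8")

print(as_text)
print(len(raw), len(mangled))
print(mangled == raw)  # False: the binary content has been altered
```

Because the data is altered at this step, no choice of file extension afterwards can recover it. The bytes have to be written unchanged in binary mode.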
The following code will download the file. I used the requests library, which is a better choice than urllib here.
import requests
file_url = "https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/959864/COVID-19-transport-use-statistics.ods"
file_data = requests.get(file_url).content
# Open the file in binary write mode, because the downloaded data is bytes
with open("historical.ods", "wb") as file:
    file.write(file_data)
The output file can be viewed in MS Excel.
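If you would rather stay with urllib, as in the question, something along these lines should also work. This is only a sketch: the helper name `download_file` is my own, and it simply copies the response bytes to disk without decoding them.

```python
import shutil
from urllib import request


def download_file(url, dest_path):
    """Copy the bytes served at `url` to `dest_path` without decoding them.

    `download_file` is a hypothetical helper name, not part of urllib.
    """
    # urlopen returns a file-like object; copyfileobj streams it to disk
    # in chunks, so the whole file is never held in memory as one string.
    with request.urlopen(url) as response, open(dest_path, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return dest_path


# Example call (performs a real network request, so it is left commented out):
# download_file(
#     "https://assets.publishing.service.gov.uk/government/uploads/system/"
#     "uploads/attachment_data/file/959864/COVID-19-transport-use-statistics.ods",
#     "historical.ods",
# )
```

The important parts are the "wb" mode and the fact that the bytes are never passed through str().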
To download the file:
In [1]: import requests
In [2]: url = 'https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/959864/COVID-19-transport-use-statistics.ods'
In [3]: with open('COVID-19-transport-use-statistics.ods', 'wb') as out_file:
   ...:     content = requests.get(url, stream=True).content
   ...:     out_file.write(content)
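Before reading the file, it may be worth checking that what was saved really is an ODS document and not, say, an HTML error page. A rough sketch, assuming only that an .ods file is a ZIP archive containing a `mimetype` entry; the helper name `looks_like_ods` is made up for this example:

```python
import zipfile

# MIME type stored in the "mimetype" entry of an OpenDocument spreadsheet.
ODS_MIMETYPE = "application/vnd.oasis.opendocument.spreadsheet"


def looks_like_ods(path):
    """Return True if `path` is a ZIP archive that declares the ODS MIME type.

    `looks_like_ods` is a hypothetical helper, not a library function.
    """
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        if "mimetype" not in archive.namelist():
            return False
        declared = archive.read("mimetype").decode("ascii", "replace").strip()
    return declared == ODS_MIMETYPE


# Example call, assuming the download above has already been run:
# looks_like_ods('COVID-19-transport-use-statistics.ods')
```

This is only a sanity check on the container; it does not validate the spreadsheet contents.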
You can then read the file with pandas-ods-reader. Install it by running:
pip install pandas-ods-reader
Then:
In [4]: from pandas_ods_reader import read_ods
In [5]: df = read_ods('COVID-19-transport-use-statistics.ods', 1)
In [6]: df
Out[6]:
Department for Transport statistics ... unnamed.9
0 https://www.gov.uk/government/statistics/trans... ... None
1 None ... None
2 Use of transport modes: Great Britain, since 1... ... None
3 Figures are percentages of an equivalent day o... ... None
4 None ... Percentage
.. ... ... ...
390 Transport for London Tube and Bus ... None
391 Buses (excl. London) ... None
392 Cycling ... None
393 Any other queries ... None
394 Media enquiries ... None
If this is what you want, you can save it as CSV with df.to_csv('my_data.csv', index=False).