
I want to extract the CSV file from a webpage using Python (web scraping)

I want to get the .csv file or the .xlsx file from this webpage. I thought about web scraping with BeautifulSoup, but that seems inefficient. I want to write a function that, when called with this webpage, locates the links to the CSV files and returns the CSV file to me.

This is so that I can then run an analysis on the CSV file.

Could someone please help me out here?

Here's the link: https://data.london.gov.uk/dataset/recorded_crime_rates

Use the urllib library to get the source of the webpage, then search that source for the CSV download link.

This seems to work:

import urllib.request, urllib.error, urllib.parse

url = 'https://data.london.gov.uk/dataset/recorded_crime_rates'
csvfile = r"C:\Tmp\CrimeRates.csv"

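#open main page
response = urllib.request.urlopen(url)
webContent = response.read()
#decode the bytes to text; str(webContent) would give the "b'...'" repr instead
wc = webContent.decode("utf-8", errors="replace")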

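#get csv URL
i = wc.find("crime%20rates.csv")
i2 = wc.find("/download/recorded_crime_rates", max(i-200, 0))
#stop early if either marker is missing from the page source
if i == -1 or i2 == -1:
    raise ValueError("CSV link not found in page source")
csvURL = "https://data.london.gov.uk" + wc[i2:i+17]
print(csvURL)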

#get csv
csvresp = urllib.request.urlopen(csvURL)
csvdata = csvresp.read()
print(len(csvdata), "bytes")

#save csv to file
print("Saving To", csvfile)
#write the raw bytes; str(csvdata) would save the "b'...'" repr with escaped newlines
with open(csvfile, "wb") as f:
    f.write(csvdata)
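
If you would rather not hard-code the file name and the 200-character offset, here is a rough sketch of the function the question asks for, using only the standard library. The names LinkCollector, find_file_links and fetch_csv_rows are my own, not from any library, and it assumes the download links appear as ordinary <a href="..."> tags in the static page source (if the page builds them with JavaScript, this will find nothing). It also assumes the CSV is UTF-8 encoded.

```python
import csv
import io
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_file_links(html, base_url, extensions=(".csv", ".xlsx")):
    """Return absolute URLs of links whose path ends with one of the extensions."""
    parser = LinkCollector()
    parser.feed(html)
    wanted = tuple(ext.lower() for ext in extensions)
    links = []
    for href in parser.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        if urlparse(absolute).path.lower().endswith(wanted):
            links.append(absolute)
    return links


def fetch_csv_rows(page_url):
    """Download the first CSV linked from page_url and return its rows as lists.

    Needs network access. Assumes the CSV is UTF-8 (an optional BOM is dropped).
    """
    with urllib.request.urlopen(page_url) as response:
        html = response.read().decode("utf-8", errors="replace")
    csv_links = find_file_links(html, page_url, extensions=(".csv",))
    if not csv_links:
        raise ValueError("no CSV link found on " + page_url)
    with urllib.request.urlopen(csv_links[0]) as response:
        text = response.read().decode("utf-8-sig", errors="replace")
    return list(csv.reader(io.StringIO(text)))
```

Calling fetch_csv_rows("https://data.london.gov.uk/dataset/recorded_crime_rates") should then give you the rows as a list of lists, ready for analysis, provided the page still exposes the link in its HTML. I have not checked which CSV comes first on that page, so you may want to inspect the result of find_file_links before picking one.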


