Importing HTML code into CSV using Python
I have an HTML page with data that I want to bring into Python and write to a CSV. I'm not sure which package will let me do this, as I've tried a few different ones (bs4 and urllib) with no success.
This is the HTML link:
https://www.cmegroup.com/CmeWS/mvc/Volume/Details/F/8478/20200807/F?tradeDate=20200807
Out of interest, what kind of HTML link is this? It appears to be almost in CSV format already. Apologies if this is a silly question. I've tried searching for file types on the internet too.
I tried a URL request on this web link but received an error when making the request:
from urllib.request import urlopen as uReq
cme_url = "https://www.cmegroup.com/CmeWS/mvc/Volume/Details/F/8478/20200807/F?tradeDate=20200807"
#opening up connection
uClient = uReq(cme_url)
I have scoured Stack Overflow for examples that could answer my question, but without success. For example, this one didn't help because it starts from a file that is already a CSV: Importing CSV into Python
I really appreciate your assistance.
The data returned by the URL you provided is in JSON format.
So your question is really "How do I convert a JSON file to CSV?"
Python can handle this on its own, using the built-in json library.
You can read JSON from a URL and convert it to CSV in a couple of steps.
I assume you only want the month data.
Here's the code:
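As a rough sketch of that standard-library route, something like the following might work. The "monthData" key is taken from the pandas code further down; the browser-like User-Agent header is only an assumption about why the original urlopen call failed (the question does not say which error was raised), and the helper names fetch_text and records_to_csv are made up for illustration. It also assumes every record is a flat object with the same keys.

```python
import csv
import io
import json
from urllib.request import Request, urlopen


def fetch_text(url, user_agent="Mozilla/5.0", timeout=10):
    """Download a URL and return the body as text.

    Sending a browser-like User-Agent is an assumption: some sites
    reject the default urllib agent, but that may not be the cause here.
    """
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def records_to_csv(json_text, key="monthData"):
    """Turn the list of flat JSON objects stored under `key` into CSV text."""
    records = json.loads(json_text)[key]
    if not records:
        return ""
    buffer = io.StringIO()
    # Use the keys of the first record as the header row.
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Offline example with a tiny hand-written payload in the same shape.
sample = '{"monthData": [{"month": "AUG 20", "globex": "10,007", "openOutcry": "0"}]}'
csv_text = records_to_csv(sample)
print(csv_text)

# With network access, the two helpers could be combined like this:
# cme_url = "https://www.cmegroup.com/CmeWS/mvc/Volume/Details/F/8478/20200807/F?tradeDate=20200807"
# with open("out.csv", "w", newline="") as f:
#     f.write(records_to_csv(fetch_text(cme_url)))
```

Values that contain commas, such as "10,007", are quoted automatically by the csv module, so they stay in a single column.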
import requests
import pandas as pd
url = 'https://www.cmegroup.com/CmeWS/mvc/Volume/Details/F/8478/20200807/F?tradeDate=20200807'
r = requests.get(url)
dj = r.json()
df = pd.DataFrame(dj['monthData'])
df.to_csv('out.csv', index=False)
Output (out.csv):
month,monthID,globex,openOutcry,totalVolume,blockVolume,efpVol,efrVol,eooVol,efsVol,subVol,pntVol,tasVol,deliveries,opnt,aon,atClose,change,strike,exercises
AUG 20,AUG-20-Calls,"10,007",0,"10,007",0,0,0,0,0,0,0,0,0,-,-,"9,372","-1,103",0,0
SEP 20,SEP-20-Calls,"1,316",0,"1,316",0,0,0,0,0,0,0,0,0,-,-,"2,899",47,0,0
OCT 20,OCT-20-Calls,115,0,115,0,0,0,0,0,0,0,0,0,-,-,614,32,0,0
NOV 20,NOV-20-Calls,16,0,16,0,0,0,0,0,0,0,0,0,-,-,68,6,0,0
DEC 20,DEC-20-Calls,13,0,13,0,0,0,0,0,0,0,0,0,-,-,105,-3,0,0
JAN 21,JAN-21-Calls,6,0,6,0,0,0,0,0,0,0,0,0,-,-,5,4,0,0
DEC 21,DEC-21-Calls,0,0,0,0,0,0,0,0,0,0,0,0,-,-,1,0,0,0
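Note that several columns in this output hold numbers as text with thousands separators ("10,007", "-1,103") and use "-" where there is no value. If you need real integers, a small helper along these lines could be applied before writing the CSV. The name parse_count is hypothetical, and treating "-" as missing is an assumption about what the feed means by it.

```python
def parse_count(value):
    """Convert a string such as "10,007" or "-1,103" to an int.

    Returns None for "-" or an empty string, which are assumed here
    to mean "no value".
    """
    cleaned = str(value).strip().replace(",", "")
    if cleaned in ("", "-"):
        return None
    return int(cleaned)


print(parse_count("10,007"))  # 10007
print(parse_count("-1,103"))  # -1103
print(parse_count("-"))       # None
```

With pandas, the same function could be mapped over a column, for example df['globex'] = df['globex'].map(parse_count).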