Converting a JSON Link into a Pandas DataFrame
Please look at the following explanation of the problem.
I have a JSON data source, https://data.cdc.gov/api/views/x8jf-txib/rows.json, and I want to convert this data into a Pandas DataFrame.
If you look at the JSON dataset, it consists of metadata followed by the actual data. I would like a way to store the metadata in one file and the dataset in another file on my local system.
I have developed this method, but I am not able to get it to work completely:
from urllib.request import urlopen
import json
# Get the dataset
url = "https://data.cdc.gov/api/views/x8jf-txib/rows.json"
response = urlopen(url)
# Convert bytes to string type and string type to dict
string = response.read().decode('utf-8')
json_obj = json.loads(string)
The above step converts the JSON file into a dictionary. When I try to convert it into a Pandas DataFrame using this:
pd.DataFrame([json_obj.items()])
I get the output as this:
Please help me with this. I appreciate it.
In Python, json.loads gives you back a dict (a map/object) if the JSON string was parsed properly and its top-level value is a JSON object. I think what you want to construct the DataFrame from is the following:
df = pd.DataFrame.from_records(json_obj['data'])
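To see why passing json_obj.items() does not give a usable table, it can help to inspect the parsed object first. The sketch below does not download anything: it uses a tiny hand-made dict that only assumes the same two top-level keys ("meta" and "data") as the real response, and every value in it is a placeholder rather than actual CDC data.

```python
import pandas as pd

# Hypothetical stand-in for the parsed response: same top-level keys, made-up values.
json_obj = {
    "meta": {"view": {"name": "example"}},
    "data": [["row-1", "a", 1], ["row-2", "b", 2]],
}

# The top-level keys show that the rows sit under "data".
print(list(json_obj.keys()))

# Wrapping items() in a list yields a single row of (key, value) pairs,
# not one row per record.
wrong = pd.DataFrame([json_obj.items()])
print(wrong.shape)

# Passing only the list stored under "data" yields one row per record.
df = pd.DataFrame.from_records(json_obj["data"])
print(df.shape)
```

The same json_obj["data"] lookup is what the one-liner above relies on for the real dataset.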
Here's a working script:
import pandas as pd
from urllib.request import urlopen
import json
# Get the dataset
url = "https://data.cdc.gov/api/views/x8jf-txib/rows.json"
response = urlopen(url)
# Convert bytes to string type and string type to dict
string = response.read().decode('utf-8')
json_obj = json.loads(string)
df = pd.DataFrame.from_records(json_obj['data'])
print(df.head())
You should get output that looks something like:
0 1 2 ... 38 39 40
0 row-ss5i~ibqh-im6e 00000000-0000-0000-E6C3-33C094361E41 0 ... None None None
1 row-7jrs-n8wf_crzs 00000000-0000-0000-22EC-13B75E5E7127 0 ... None None None
2 row-ddqq-yzd7.yyhz 00000000-0000-0000-319D-A1D4FB17A377 0 ... None None None
3 row-kzem-t4xs.n4ss 00000000-0000-0000-6ED5-CF3857CC1862 0 ... None None None
4 row-9ws9-2nrx~xqqg 00000000-0000-0000-3403-E46EFF15AE5B 0 ... POINT (-89.148632 40.124144) 1721 34
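The question also asks for the metadata and the dataset to end up in separate local files, and the output above still has numeric column labels. One possible way to handle both is sketched below. It assumes the column descriptions live under meta -> view -> columns, each with a "name" field, which should be checked against the real response. The helper name split_socrata_json, the file names, and the sample values are all made up for illustration.

```python
import json
import os
import tempfile

import pandas as pd


def split_socrata_json(json_obj, meta_path, data_path):
    """Write the metadata to a JSON file and the rows to a CSV file."""
    meta = json_obj["meta"]
    # Assumption: one entry per column under meta -> view -> columns.
    names = [col["name"] for col in meta["view"]["columns"]]
    df = pd.DataFrame.from_records(json_obj["data"], columns=names)

    # Metadata goes to its own JSON file.
    with open(meta_path, "w", encoding="utf-8") as fh:
        json.dump(meta, fh, indent=2)

    # The dataset goes to a separate CSV file, without the row index.
    df.to_csv(data_path, index=False)
    return df


# Hand-made sample with the assumed layout; not real CDC data.
sample = {
    "meta": {"view": {"columns": [{"name": "sid"},
                                  {"name": "state"},
                                  {"name": "value"}]}},
    "data": [["row-1", "IL", 34], ["row-2", "NY", 12]],
}

out_dir = tempfile.mkdtemp()
meta_path = os.path.join(out_dir, "metadata.json")
data_path = os.path.join(out_dir, "dataset.csv")

df = split_socrata_json(sample, meta_path, data_path)
print(df)
```

With the real download, the json_obj from the script above would be passed in place of sample. If the number of names does not match the width of each row, from_records raises an error, which is a quick sign that the assumed metadata layout is wrong.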