Convert this nested JSON to pandas dataframe
This is my JSON:
my_json = {
"machine_name": {
"0100": [
{
"date": "21/03/2019",
"chainage": "27760.156",
"unix_time": "1553110535",
"time": "03:35:35",
"f0001": "0.0",
"f0002": "0.0",
"f0006": "0.0",
"f0007": "0.0",
"f0008": "0.0",
"f0009": "0.0"
},
{
"date": "22/03/2019",
"chainage": "27760.156",
"unix_time": "1553110535",
"time": "03:35:35",
"f0001": "0.0",
"f0002": "0.0",
"f0006": "0.0",
"f0007": "0.0",
"f0008": "0.0",
"f0009": "0.0"
}
],
"0101": [
{
"date": "21/03/2019",
"chainage": "27761.498",
"unix_time": "1553131029",
"time": "09:17:09",
"f0001": "0.347",
"f0002": "0.007",
"f0006": "2.524",
"f0007": "0.0",
"f0008": "121.036",
"f0009": "0.0"
},
{
"date": "22/03/2019",
"chainage": "27761.498",
"unix_time": "1553131029",
"time": "09:17:09",
"f0001": "0.347",
"f0002": "0.007",
"f0006": "2.524",
"f0007": "0.0",
"f0008": "121.036",
"f0009": "0.0"
}
]
}
}
I want to create a pandas DataFrame with headers "date", "chainage", "unix_time", etc.
I looked at read_json and json_normalize, but the output was not what I expected. Any ideas on how to get the expected result?
>>> rows = [v[0] for k, v in my_json['machine_name'].items()]
>>> rows # I fixed up the line-wrapping here for readability.
[{'date': '21/03/2019', 'chainage': '27760.156', 'unix_time': '1553110535',
'time': '03:35:35', 'f0001': '0.0', 'f0002': '0.0', 'f0006': '0.0',
'f0007': '0.0', 'f0008': '0.0', 'f0009': '0.0'}, {'date': '21/03/2019',
'chainage': '27761.498', 'unix_time': '1553131029', 'time': '09:17:09',
'f0001': '0.347', 'f0002': '0.007', 'f0006': '2.524', 'f0007': '0.0',
'f0008': '121.036', 'f0009': '0.0'}]
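This gives us a list of the actual dictionaries, taking the first element of each list stored as a value under machine_name. Note that v[0] keeps only the first entry per key, so with the data shown in the question the 22/03/2019 entries are dropped. We can then build a table as usual: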
>>> df = pd.DataFrame(rows)
And add the index:
# we need to convert to Index explicitly from the dict_keys.
>>> index = pd.Index(my_json['machine_name'].keys())
>>> df.set_index(index, inplace=True)
The result looks right to me:
>>> df
chainage date f0001 f0002 ... f0008 f0009 time unix_time
0100 27760.156 21/03/2019 0.0 0.0 ... 0.0 0.0 03:35:35 1553110535
0101 27761.498 21/03/2019 0.347 0.007 ... 121.036 0.0 09:17:09 1553131029
[2 rows x 10 columns]
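If each key can hold more than one record, as in the sample data in the question, one option is to keep every record and store the outer key alongside it. The following is a minimal sketch under stated assumptions: the data is a shortened stand-in with the same shape as the question's, and the column name `machine_id` is made up for illustration.

```python
import pandas as pd

# Shortened stand-in for the question's data (assumption: same shape, fewer fields).
my_json = {
    "machine_name": {
        "0100": [
            {"date": "21/03/2019", "chainage": "27760.156", "f0001": "0.0"},
            {"date": "22/03/2019", "chainage": "27760.156", "f0001": "0.0"},
        ],
        "0101": [
            {"date": "21/03/2019", "chainage": "27761.498", "f0001": "0.347"},
            {"date": "22/03/2019", "chainage": "27761.498", "f0001": "0.347"},
        ],
    }
}

# Build one row per record; "machine_id" is a hypothetical name for the outer key.
rows = [
    {"machine_id": key, **record}
    for key, records in my_json["machine_name"].items()
    for record in records
]

# The index repeats for keys that hold several records.
df = pd.DataFrame(rows).set_index("machine_id")
print(df)
```

With this layout the index is no longer unique, so selecting `df.loc["0101"]` returns every record for that key rather than a single row.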
The following seems to work.
Note that, given the way you declared the my_json variable, Python will read it as a dictionary, not a JSON string.
import pandas as pd
my_json = { ... } # your data; Python reads this literal as a dictionary.
# for convenience, create this variable
d = my_json['machine_name']
# create a list to store each row (i.e.: 0100, 0101)
dict_ls = []
# loop through d and store the first internal dictionary of each key (i.e. d['0100'][0], d['0101'][0], etc.) in the list
for row in d.keys():
    dict_ls.append(d[row][0])
# convert the list of dictionaries into a dataframe
df = pd.DataFrame(dict_ls)
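If the data does arrive as an actual JSON string (for example, read from a file or a response body), it can be parsed with the standard library's `json.loads` first, after which the same approach applies. This is only a sketch: the `raw` string is a shortened, made-up stand-in for the real payload, and the final numeric conversion is optional and assumes you want numbers rather than the original strings.

```python
import json

import pandas as pd

# Hypothetical JSON string with the same nesting as the question's data.
raw = (
    '{"machine_name": {'
    '"0100": [{"date": "21/03/2019", "f0001": "0.0"}], '
    '"0101": [{"date": "21/03/2019", "f0001": "0.347"}]}}'
)

# json.loads turns the string into ordinary Python dicts and lists.
my_json = json.loads(raw)
d = my_json["machine_name"]

# Take the first record of each key and use the keys as the index.
df = pd.DataFrame([d[key][0] for key in d], index=list(d))

# The measurements are still strings; convert a column explicitly if needed.
df["f0001"] = pd.to_numeric(df["f0001"])
print(df)
```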