How to convert a complex dictionary with nested dictionaries in a list into a Pandas DataFrame in streaming data
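I am trying to create a DataFrame from the complex dictionary below, but I can't work out how to resolve the values in the last column. Any guidance would be great!
Code: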
import pandas as pd
from pandas.io.json import json_normalize

stream = {
    "Outerclass": {
        "Main_ID": "1",
        "SetID": "1041",
        "Version": 2,
        "nestedData": {
            "time": ["5000", "6000", "7000"],
            "values": [
                {"intValue": 1, "value": "intValue"},
                {"floatValue": 2.5, "value": "floatValue"},
                {"stringValue": "abc", "value": "stringValue"},
            ],
        },
    }
}

s = json_normalize(stream['Outerclass'])
s = s.join(pd.concat([s.pop(x).explode() for x in ['nestedData.time', 'nestedData.values']], axis=1))
print(s)
Expected output:
Main_ID  SetID  Version  nestedData.time  nestedData.values
1        1041   2        5000             1
1        1041   2        6000             2.5
1        1041   2        7000             abc
Actual output:
Main_ID  SetID  Version  nestedData.time  nestedData.values
1        1041   2        5000             {'intValue': 1, 'value': 'intValue'}
1        1041   2        6000             {'floatValue': 2.5, 'value': 'floatValue'}
1        1041   2        7000             {'stringValue': 'abc', 'value': 'stringValue'}
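Since you want to extract some of these fields with custom logic (not necessarily a classic JSON normalization), which here means ignoring the 'value' key in each dictionary and fetching the value stored under whatever other key is present, I suggest the following: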
# Almost the same as before (reuses `pd` and `stream` from the question)
s = pd.json_normalize(stream['Outerclass'])  # pandas.io.json.json_normalize is deprecated
s = s.join(
    pd.concat(
        [s.pop(x).explode() for x in ['nestedData.time', 'nestedData.values']],
        axis=1,
    )
).reset_index(drop=True)

def get_value(d):
    """Extract the value stored under the first key that is not 'value'."""
    k = [k for k in d.keys() if k != 'value'][0]
    return d.get(k)

# Stores the result in a new "Values" column; assign to
# s["nestedData.values"] instead to overwrite the dictionaries in place.
s["Values"] = s["nestedData.values"].apply(get_value)
print(s)
The resulting DataFrame, with the extracted values written back into nestedData.values, looks like this:
   Main_ID  SetID  Version  nestedData.time  nestedData.values
0  1        1041   2        5000             1
1  1        1041   2        6000             2.5
2  1        1041   2        7000             abc
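Each dictionary in the sample also carries a "value" entry that names the key holding the payload, so a variant can look that key up directly rather than taking the first key that is not 'value'. The following is a minimal sketch under that assumption. The helper names get_value_by_tag and message_to_frame are hypothetical, and the sketch assumes every streamed message has the same shape as the sample in the question. It wraps the steps in a function so that each incoming message can be converted on its own:

```python
import pandas as pd

def get_value_by_tag(d):
    """Return the payload named by the dict's 'value' entry (None if absent)."""
    # Assumption: d["value"] holds the name of the key that carries the payload.
    return d.get(d.get("value"))

def message_to_frame(message):
    """Convert one message shaped like the sample into a flat DataFrame."""
    frame = pd.json_normalize(message["Outerclass"])
    list_cols = ["nestedData.time", "nestedData.values"]
    # Explode both list columns side by side so their elements stay paired.
    exploded = pd.concat([frame.pop(c).explode() for c in list_cols], axis=1)
    frame = frame.join(exploded).reset_index(drop=True)
    # Overwrite the dictionaries with the extracted payloads.
    frame["nestedData.values"] = frame["nestedData.values"].apply(get_value_by_tag)
    return frame

sample = {
    "Outerclass": {
        "Main_ID": "1",
        "SetID": "1041",
        "Version": 2,
        "nestedData": {
            "time": ["5000", "6000", "7000"],
            "values": [
                {"intValue": 1, "value": "intValue"},
                {"floatValue": 2.5, "value": "floatValue"},
                {"stringValue": "abc", "value": "stringValue"},
            ],
        },
    }
}

result = message_to_frame(sample)
print(result)

# For a stream, per-message frames can be collected and combined afterwards.
combined = pd.concat([message_to_frame(m) for m in [sample, sample]], ignore_index=True)
```

Looking up the key by name avoids relying on dictionary key order, at the cost of returning None for any dictionary that lacks a "value" entry.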
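Also, note the alternative that uses JSON normalization only:
pd.json_normalize(s["nestedData.values"])
This produces a table with one column per possible key, which is what json_normalize is expected to do, but not what is wanted here:
   intValue  value        floatValue  stringValue
0  1.0       intValue     NaN         NaN
1  NaN       floatValue   2.5         NaN
2  NaN       stringValue  NaN         abc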
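If the wide table is the starting point anyway, it can still be collapsed back into a single column by treating the value column as a per-row pointer to the populated field. This is only a sketch: the names records, wide and payload are illustrative, and it assumes every row's "value" entry names an existing column. Note that the integer comes back as 1.0, because the NaN-padded intValue column is stored as float:

```python
import pandas as pd

records = [
    {"intValue": 1, "value": "intValue"},
    {"floatValue": 2.5, "value": "floatValue"},
    {"stringValue": "abc", "value": "stringValue"},
]

# One column per key, padded with NaN where a record lacks that key.
wide = pd.json_normalize(records)

# For each row, read the column whose name is stored in "value".
payload = [row[row["value"]] for _, row in wide.iterrows()]
print(payload)
```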