What is the pythonic way to create a DataFrame from a list of nested dictionary structures (with two levels)?
I am receiving requests in the following format (I cannot change the input request format):
{
    "inputs": [
        {
            "TimeGenerated": "datetimestring",
            "counters": {
                "counter1": float_value,
                "counter2": float_value,
                "counter3": float_value
            }
        },
        {
            "TimeGenerated": "datetimestring",
            "counters": {
                "counter1": float_value,
                "counter2": float_value,
                "counter3": float_value
            }
        },
        {
            "TimeGenerated": "datetimestring",
            "counters": {
                "counter1": float_value,
                "counter2": float_value,
                "counter3": float_value
            }
        }
    ]
}
I want to create a DataFrame out of this dictionary with the columns TimeGenerated, counter1, counter2, counter3.
What is the most effective pythonic way to create a DataFrame out of this list of nested dictionaries?
The solution I have found is:
import pandas as pd

x = []
for i in input_json['inputs']:
    counters = i['counters']  # the inner dict; we do not want "counters" itself in the column headers
    counters['_time'] = i['TimeGenerated']  # add the timestamp to the same flat dict (note: this modifies input_json in place)
    x.append(counters)  # build a list of flat, single-level dictionaries
in_df = pd.DataFrame(x)  # create a DataFrame from the list
in_df['_time'] = pd.to_datetime(in_df['_time'])  # convert the datetime strings to datetime
But I am sure there are more effective ways to achieve this!
Some other questions on StackOverflow address similar concerns (but expect different results). I am adding them for anyone who stumbles across this while actually searching for a different end result; they also serve as a good comparison point for working with Python dictionaries, lists and DataFrames and how they relate to one another.
Assuming all the sub-objects have the same structure, you can take the keys from the first one and use those for the columns (here j is the parsed request dictionary).
columns = ['TimeGenerated', *j['inputs'][0]['counters'].keys()]
df = pd.DataFrame([[t['TimeGenerated'], *t['counters'].values()] for t in j['inputs']], columns=columns)
Output
>>> df
    TimeGenerated  counter1  counter2  counter3
0  datetimestring   123.456   123.456   123.456
1  datetimestring   123.456   123.456   123.456
2  datetimestring   123.456   123.456   123.456
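As a possible alternative, pandas also ships `pd.json_normalize`, which flattens nested dictionaries and does not require every record to have identical counter keys. The sketch below is only an illustration: the sample payload and its timestamp values are made up, and it assumes the nested key is always called `counters`.

```python
import pandas as pd

# Hypothetical sample payload in the shape described in the question.
input_json = {
    "inputs": [
        {"TimeGenerated": "2021-01-01T00:00:00",
         "counters": {"counter1": 1.0, "counter2": 2.0, "counter3": 3.0}},
        {"TimeGenerated": "2021-01-01T00:01:00",
         "counters": {"counter1": 4.0, "counter2": 5.0, "counter3": 6.0}},
    ]
}

# json_normalize flattens the nested dicts into columns named
# "counters.counter1", "counters.counter2", and so on.
df = pd.json_normalize(input_json["inputs"])

# Keep only the part after the first dot so the columns become
# TimeGenerated, counter1, counter2, counter3.
df.columns = [c.split(".", 1)[-1] for c in df.columns]

# Convert the timestamp strings to datetime values.
df["TimeGenerated"] = pd.to_datetime(df["TimeGenerated"])

print(df)
```

Unlike the list-of-values approach, this matches values to columns by key rather than by position, so a record with a missing counter should produce NaN instead of a misaligned row.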