
What is the Pythonic way to create a DataFrame from a list of nested dictionary structures (with two levels)?

I receive requests in the following format (I cannot change the input request format):

{
    "inputs": [
        {
            "TimeGenerated": "datetimestring",
            "counters": {
                "counter1": float_value,
                "counter2": float_value,
                "counter3": float_value
            }
        },
        {
            "TimeGenerated": "datetimestring",
            "counters": {
                "counter1": float_value,
                "counter2": float_value,
                "counter3": float_value
            }
        },
        {
            "TimeGenerated": "datetimestring",
            "counters": {
                "counter1": float_value,
                "counter2": float_value,
                "counter3": float_value
            }
        }
    ]
}

I want to create a DataFrame out of this dictionary with the columns TimeGenerated, counter1, counter2, counter3.

What is the most efficient, Pythonic way to create a DataFrame out of this list of nested dictionaries?


Possible Solution (Not the Most Efficient One)

The solution I have found is:

import pandas as pd

rows = []
for item in input_json['inputs']:
    # Copy the inner dict so the original request is not modified. We do not want "counters" in the column headers.
    # The copy looks like {"counter1": float_value, "counter2": float_value, "counter3": float_value}
    counters = dict(item['counters'])
    # Add the timestamp, giving one flat dictionary per record (no nesting)
    counters['_time'] = item['TimeGenerated']
    rows.append(counters)
in_df = pd.DataFrame(rows)                         # Create a DataFrame from the list of flat dictionaries
in_df['_time'] = pd.to_datetime(in_df['_time'])    # Convert the datetime strings to datetime values

But I am sure there are more efficient ways to achieve this!
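
For comparison, the same flattening can be written as a single comprehension that builds a new dictionary per record and keeps the TimeGenerated column name. This is only a sketch: the sample timestamps and counter values below are made up for illustration, and it is not claimed to be the fastest option.

```python
import pandas as pd

# Hypothetical sample request; the timestamps and counter values are made up.
input_json = {
    "inputs": [
        {"TimeGenerated": "2021-01-01 00:00:00",
         "counters": {"counter1": 1.0, "counter2": 2.0, "counter3": 3.0}},
        {"TimeGenerated": "2021-01-01 00:01:00",
         "counters": {"counter1": 4.0, "counter2": 5.0, "counter3": 6.0}},
    ]
}

# Build one new flat dict per record: the timestamp first, then the counters
# unpacked next to it. The request dictionaries themselves are left unchanged.
rows = [
    {"TimeGenerated": rec["TimeGenerated"], **rec["counters"]}
    for rec in input_json["inputs"]
]

in_df = pd.DataFrame(rows)
in_df["TimeGenerated"] = pd.to_datetime(in_df["TimeGenerated"])
```

Because each row is a dictionary, pandas matches values to columns by key, so the counters do not need to appear in the same order in every record.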


Similar Questions (with different expected end-results)

Some other questions on StackOverflow address similar concerns but expect different results. I am adding them for anyone who has stumbled across this question while actually searching for a different end result. They also serve as a good comparison point for working with Python dictionaries, lists, and DataFrames and how they relate to one another.

  1. Python Dataframe contains a list of dictionaries, need to create new dataframe with dictionary items
  2. Create pandas dataframe from nested dict with outer keys as df index and inner keys column headers
  3. Create Dataframe from a nested dictionary

Assuming all the sub-objects have the same structure (the same counter keys, in the same order), you can take the keys from the first one and use them as the column names.

# j is the parsed request dictionary (input_json in the question)
columns = ['TimeGenerated', *j['inputs'][0]['counters'].keys()]
df = pd.DataFrame([[t['TimeGenerated'], *t['counters'].values()] for t in j['inputs']], columns=columns)

Output

>>> df
    TimeGenerated  counter1  counter2  counter3
0  datetimestring   123.456   123.456   123.456
1  datetimestring   123.456   123.456   123.456
2  datetimestring   123.456   123.456   123.456
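
Another option worth considering is pandas' own `pd.json_normalize`, which flattens nested dictionaries into columns. The following is a sketch rather than a benchmarked recommendation: the sample values are made up, and the step that strips the `counters.` prefix is an illustrative choice to get the column names asked for in the question.

```python
import pandas as pd

# Hypothetical sample request; the timestamps and counter values are made up.
j = {
    "inputs": [
        {"TimeGenerated": "2021-01-01 00:00:00",
         "counters": {"counter1": 1.0, "counter2": 2.0, "counter3": 3.0}},
        {"TimeGenerated": "2021-01-01 00:01:00",
         "counters": {"counter1": 4.0, "counter2": 5.0, "counter3": 6.0}},
    ]
}

# json_normalize flattens the nested dict into columns such as
# "counters.counter1" (the default separator is ".").
df = pd.json_normalize(j["inputs"])

# Strip the "counters." prefix so the columns are named counter1, counter2, ...
prefix = "counters."
df = df.rename(columns=lambda c: c[len(prefix):] if c.startswith(prefix) else c)

# Convert the timestamp strings to datetime values.
df["TimeGenerated"] = pd.to_datetime(df["TimeGenerated"])
```

Unlike the positional `values()` approach, this matches values to columns by key, so records whose counters appear in a different order still line up, and a counter missing from one record shows up as NaN.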


 