What is the Pythonic way to create a DataFrame from a list of nested dictionaries (two levels deep)?

I am receiving requests in the following format (I cannot change the input request format):

{
    "inputs": [
        {
            "TimeGenerated": "datetimestring",
            "counters": {
                "counter1": float_value,
                "counter2": float_value,
                "counter3": float_value
            }
        },
        {
            "TimeGenerated": "datetimestring",
            "counters": {
                "counter1": float_value,
                "counter2": float_value,
                "counter3": float_value
            }
        },
        {
            "TimeGenerated": "datetimestring",
            "counters": {
                "counter1": float_value,
                "counter2": float_value,
                "counter3": float_value
            }
        }
    ]
}

I want to create a DataFrame out of this dictionary with the columns TimeGenerated, counter1, counter2, and counter3.

What is the most efficient, Pythonic way to create a DataFrame out of this list of nested dictionaries?


Possible Solution (Not the Most Efficient One)

The solution I have found is:

import pandas as pd

x = []
for i in input_json['inputs']:
    # Copy the inner dictionary so the original request is not modified.
    # "counters" itself must not end up as a column header.
    counters = dict(i['counters'])
    # Add the timestamp next to the counters, giving one flat dictionary per record.
    counters['_time'] = i['TimeGenerated']
    x.append(counters)
in_df = pd.DataFrame(x)                            # Build the DataFrame from the list of flat dictionaries
in_df['_time'] = pd.to_datetime(in_df['_time'])    # Convert the datetime strings to datetime values

But I am sure there are more efficient ways to achieve this!
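
The loop above can also be collapsed into a single list comprehension that builds one flat dictionary per record and leaves the original request untouched. This is only a minimal sketch: it assumes input_json is the already-parsed request, the sample timestamps and counter values are invented, and the helper name flatten_inputs is hypothetical.

```python
import pandas as pd

# Hypothetical sample request; timestamps and counter values are made up.
input_json = {
    "inputs": [
        {"TimeGenerated": "2021-01-01T00:00:00",
         "counters": {"counter1": 1.0, "counter2": 2.0, "counter3": 3.0}},
        {"TimeGenerated": "2021-01-01T00:05:00",
         "counters": {"counter1": 4.0, "counter2": 5.0, "counter3": 6.0}},
    ]
}


def flatten_inputs(request):
    """Return a DataFrame with a _time column plus one column per counter."""
    # Dictionary unpacking merges the timestamp and the counters into a new
    # flat dictionary, so the nested "counters" dictionary is never mutated.
    rows = [
        {"_time": item["TimeGenerated"], **item["counters"]}
        for item in request["inputs"]
    ]
    df = pd.DataFrame(rows)
    df["_time"] = pd.to_datetime(df["_time"])
    return df


in_df = flatten_inputs(input_json)
print(in_df)
```

Because each row is a dictionary, pandas matches values to columns by key, so the order of the counters inside each record should not matter.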


Similar Questions (with different expected end-results)

Other StackOverflow questions address similar problems but expect different results. I am listing them for anyone who landed here while searching for one of those results; they also make a useful point of comparison for how Python dictionaries, lists, and DataFrames relate to each other.

  1. Python Dataframe contains a list of dictionaries, need to create new dataframe with dictionary items
  2. Create pandas dataframe from nested dict with outer keys as df index and inner keys column headers
  3. Create Dataframe from a nested dictionary

Assuming all the sub-objects have the same structure (the same counter keys, in the same order), you can take the keys from the first one and use them as the column names.

import pandas as pd
# j is the parsed request dictionary
columns = ['TimeGenerated', *j['inputs'][0]['counters'].keys()]
df = pd.DataFrame([[t['TimeGenerated'], *t['counters'].values()] for t in j['inputs']], columns=columns)

Output

>>> df
    TimeGenerated  counter1  counter2  counter3
0  datetimestring   123.456   123.456   123.456
1  datetimestring   123.456   123.456   123.456
2  datetimestring   123.456   123.456   123.456
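
As an alternative to building the rows by hand, pandas can flatten the nesting itself with pd.json_normalize (available at the top level since pandas 1.0). The sketch below is not part of the answer above; it assumes j is the parsed request, and the sample timestamps and values are invented for illustration.

```python
import pandas as pd

# Hypothetical sample request; timestamps and counter values are made up.
j = {
    "inputs": [
        {"TimeGenerated": "2021-01-01T00:00:00",
         "counters": {"counter1": 1.0, "counter2": 2.0, "counter3": 3.0}},
        {"TimeGenerated": "2021-01-01T00:05:00",
         "counters": {"counter1": 4.0, "counter2": 5.0, "counter3": 6.0}},
    ]
}

# json_normalize flattens nested dictionaries into dotted column names
# such as "counters.counter1".
df = pd.json_normalize(j["inputs"])

# Strip the "counters." prefix so the columns match the requested names.
df.columns = df.columns.str.replace("counters.", "", regex=False)

df["TimeGenerated"] = pd.to_datetime(df["TimeGenerated"])
print(df)
```

This variant matches values by key rather than by position, so it does not depend on every record listing its counters in the same order.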
