
Nested dictionary to pandas DataFrame

I need help converting the nested dictionary below to a pandas DataFrame.

Structure type:

DS = [{ 'Outer_key1.0' : [{ 'key1.0': 'data' , 'key2.0': 'data' , 'key3.0': 'data' } ,
                          { 'key1.1': 'data' , 'key2.1': 'data' , 'key3.1': 'data' } ,
                          { 'key1.2': 'data' , 'key2.2': 'data' , 'key3.3': 'data' } ] ,
        'Outer key2.0': 'data' ,
        'Outer Key3.0': 'data' }]

     [{ 'Outer_key1.1' : [{ 'key1.0': 'data' , 'key2.0': 'data' , 'key3.0': 'data' } ,
                          { 'key1.1': 'data' , 'key2.1': 'data' , 'key3.1': 'data' } ,
                          { 'key1.2': 'data' , 'key2.2': 'data' , 'key3.3': 'data' } ] ,
        'Outer key2.1': 'data' ,
        'Outer Key3.1': 'data' }]

The actual data looks like this:

[{'datapoints': [{'statistic': 'Minimum', 'timestamp': '2021-08-31 06:50:00.000000', 'value': 59.03},{'statistic': 'Minimum', 'timestamp': '2021-08-18 02:50:00.000000', 'value': 59.37}, {'statistic': 'Minimum', 'timestamp': '2021-08-24 16:50:00.000000', 'value': 58.84},...],'metric': 'VolumeIdleTime', 'unit': 'Seconds'}]


cc = pd.Series(DS).apply(lambda x: pd.Series({k: v for y in x for k, v in y.items()}))

IIUC, what you need is json_normalize. Set datapoints as record_path, and metric and unit as meta:

import pandas as pd

data = [{'datapoints': [{'statistic': 'Minimum', 'timestamp': '2021-08-31 06:50:00.000000', 'value': 59.03},{'statistic': 'Minimum', 'timestamp': '2021-08-18 02:50:00.000000', 'value': 59.37}, {'statistic': 'Minimum', 'timestamp': '2021-08-24 16:50:00.000000', 'value': 58.84}],'metric': 'VolumeIdleTime', 'unit': 'Seconds'}]
df = pd.json_normalize(data, record_path="datapoints", meta=["metric", "unit"])
print(df)

Output:

  statistic                   timestamp  value          metric     unit
0   Minimum  2021-08-31 06:50:00.000000  59.03  VolumeIdleTime  Seconds
1   Minimum  2021-08-18 02:50:00.000000  59.37  VolumeIdleTime  Seconds
2   Minimum  2021-08-24 16:50:00.000000  58.84  VolumeIdleTime  Seconds
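
To illustrate what record_path and meta do here, the following is a rough sketch that builds the same rows by hand and compares them with json_normalize. The sample is cut down to two datapoints, and the names rows, manual and normalized are only illustrative; they are not part of the answer above.

```python
import pandas as pd

# Cut-down sample in the same shape as the question's data.
data = [{
    'datapoints': [
        {'statistic': 'Minimum', 'timestamp': '2021-08-31 06:50:00.000000', 'value': 59.03},
        {'statistic': 'Minimum', 'timestamp': '2021-08-18 02:50:00.000000', 'value': 59.37},
    ],
    'metric': 'VolumeIdleTime',
    'unit': 'Seconds',
}]

# Manual equivalent: one row per datapoint (the record_path),
# with the outer 'metric' and 'unit' copied onto every row (the meta).
rows = [
    {**point, 'metric': outer['metric'], 'unit': outer['unit']}
    for outer in data
    for point in outer['datapoints']
]
manual = pd.DataFrame(rows)

# The same result via json_normalize.
normalized = pd.json_normalize(data, record_path='datapoints', meta=['metric', 'unit'])

print(manual)
print(normalized)
```

Both frames should have one row per datapoint, which is why json_normalize is a convenient fit for this structure.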

@Tranbi

My JSON contains instances of CPU data like the following, which occur at random:

Instance1 [{'datapoints': [{'statistic': 'Minimum', 'timestamp': '2021-08-31 06:50:00.000000', 'value': 59.03}, {'statistic': 'Minimum', 'timestamp': '2021-08-18 02:50:00.000000', 'value': 59.37}, {'statistic': 'Minimum', 'timestamp': '2021-08-24 16:50:00.000000', 'value': 58.84}], 'metric': 'VolumeIdleTime', 'unit': 'Seconds'}]

Instance2 [{'datapoints': [{'statistic': 'Minimum', 'timestamp': '2021-08-31 06:50:00.000000', 'value': 60}, {'statistic': 'Minimum', 'timestamp': '2021-08-18 02:50:00.000000', 'value': 55.45}, {'statistic': 'Minimum', 'timestamp': '2021-08-24 16:50:00.000000', 'value': 54.16}, {'statistic': 'Minimum', 'timestamp': '2021-08-06 07:50:00.000000', 'value': 50.03}, {'statistic': 'Minimum', 'timestamp': '2021-08-04 22:50:00.000000', 'value': 60}, {'statistic': 'Minimum', 'timestamp': '2021-08-26 01:50:00.000000', 'value': 60.34}], 'metric': 'VolumeIdleTime', 'unit': 'Seconds'}]

Instance3 [{'datapoints': [{'statistic': 'Minimum', 'timestamp': '2021-08-31 06:50:00.000000', 'value': 60}, {'statistic': 'Minimum', 'timestamp': '2021-08-18 02:50:00.000000', 'value': 38.12}, {'statistic': 'Minimum', 'timestamp': '2021-08-24 16:50:00.000000', 'value': 42.31}, {'statistic': 'Minimum', 'timestamp': '2021-08-06 07:50:00.000000', 'value': 45.22}, {'statistic': 'Minimum', 'timestamp': '2021-08-04 22:50:00.000000', 'value': 40.51}, {'statistic': 'Minimum', 'timestamp': '2021-08-26 01:50:00.000000', 'value': 34.35}, {'statistic': 'Minimum', 'timestamp': '2021-08-11 12:50:00.000000', 'value': 46.33}], 'metric': 'VolumeIdleTime', 'unit': 'Seconds'}]

Many more instances follow (close to 8K instances in total).
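
One possible way to handle many such instances with json_normalize, sketched under the assumption that every instance is a list holding a single dict with the keys datapoints, metric and unit, and that all instances can be collected into one Python list first. The names instance1, instance2, instances and records are hypothetical, and the sample values are shortened copies of the data above.

```python
from itertools import chain

import pandas as pd

# Shortened stand-ins for two of the instances (assumed shape: a list with one dict).
instance1 = [{'datapoints': [
                  {'statistic': 'Minimum', 'timestamp': '2021-08-31 06:50:00.000000', 'value': 59.03},
                  {'statistic': 'Minimum', 'timestamp': '2021-08-18 02:50:00.000000', 'value': 59.37}],
              'metric': 'VolumeIdleTime', 'unit': 'Seconds'}]
instance2 = [{'datapoints': [
                  {'statistic': 'Minimum', 'timestamp': '2021-08-31 06:50:00.000000', 'value': 60},
                  {'statistic': 'Minimum', 'timestamp': '2021-08-18 02:50:00.000000', 'value': 55.45},
                  {'statistic': 'Minimum', 'timestamp': '2021-08-24 16:50:00.000000', 'value': 54.16}],
              'metric': 'VolumeIdleTime', 'unit': 'Seconds'}]

# Hypothetical container for all instances, however they are actually loaded.
instances = [instance1, instance2]

# Flatten the list of lists into one list of dicts, then normalize in a single call.
# The number of datapoints may differ per instance; each datapoint becomes one row.
records = list(chain.from_iterable(instances))
df = pd.json_normalize(records, record_path='datapoints', meta=['metric', 'unit'])
print(df)
```

Note that the sample data carries no instance identifier, so the resulting rows cannot be traced back to their instance unless such a field exists in the real data and is added to meta.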

