Pandas DataFrame: How to flatten nested dictionaries inside a list into new rows

I am trying to flatten an API response. This is the response:

data = [
    {
        "id": 1,
        "status": "Public",
        "Options": [
            {
                "id": 8,
                "pId": 9
            },
            {
                "id": 10,
                "pId": 11
            }
        ]
    },
    {
        "id": 2,
        "status": "Public",
        "Options": [
            {
                "id": 12,
                "pId": 13
            },
            {
                "id": 14,
                "pId": 15
            }
        ]
    }
]

This is what I am doing: applying ast.literal_eval, df.pop and json_normalize, and then concatenating the results.

import ast

import pandas as pd
from pandas import json_normalize

def pop(child_df, column_value):

    child_df = child_df.dropna(subset=[column_value])
    if isinstance(child_df[column_value][0], str):
        print("yes")
        child_df[column_value] = child_df[column_value].apply(ast.literal_eval)
    normalized_json = [json_normalize(x) for x in child_df.pop(column_value)]
    expanded_child_df = child_df.join(pd.concat(normalized_json, ignore_index=True, sort=False).add_prefix(column_value + '_'))
    expanded_child_df.columns = [str(col).replace('\r','') for col in expanded_child_df.columns]
    expanded_child_df.columns = map(str.lower, expanded_child_df.columns)

    return expanded_child_df

df = pd.DataFrame.from_dict(data)

df2 = pop(df,'Options')

This is the output I am getting:

   id  status  options_id  options_pid
0   1  Public           8            9
1   2  Public          10           11

But the code is skipping some of the values inside the Options list. This is the expected output:

   id  status  options_id  options_pid
0   1  Public           8            9
1   1  Public          10           11
2   2  Public          12           13
3   2  Public          14           15

What am I missing here?

You can use:

df=pd.json_normalize(data).explode('Options')
df=df.join(df['Options'].apply(pd.Series).add_prefix('options_')).drop(['Options'],axis=1).drop_duplicates()
print(df)
'''
   id  status  options_id  options_pId
0   1  Public           8            9
0   1  Public          10           11
1   2  Public          12           13
1   2  Public          14           15
'''
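
As for why the original function skips rows: DataFrame.join aligns on the index, and pd.concat(..., ignore_index=True) renumbers the option rows 0..3, while the parent frame only has index labels 0 and 1. A left join therefore keeps just the first two option rows. A minimal sketch of that effect, with small hand-built frames standing in for the real ones (the names parents and options are only illustrative):

```python
import pandas as pd

# Left frame: one row per parent record, index labels 0 and 1.
parents = pd.DataFrame({"id": [1, 2], "status": ["Public", "Public"]})

# Right frame: one row per option, renumbered 0..3 the way
# pd.concat(..., ignore_index=True) would number them.
options = pd.DataFrame(
    {"options_id": [8, 10, 12, 14], "options_pid": [9, 11, 13, 15]}
)

# DataFrame.join is a left join on the index by default, so only the
# option rows labelled 0 and 1 are matched; rows 2 and 3 are dropped.
joined = parents.join(options)
print(joined)
```

Note that option row 1 (id 10) ends up attached to parent id 2, which is the mismatch visible in the output shown in the question.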
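# Alternative: let json_normalize expand the nested list directly
df = pd.json_normalize(data, record_path="Options", meta=['id','status'], record_prefix='options.')

# Alternative: explode the list, expand each dict, then concatenate
df = pd.json_normalize(data).explode('Options')
tmp = df['Options'].apply(pd.Series)
df = pd.concat([df[['id', 'status']], tmp], axis=1)
print(df)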
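
If the goal is the exact layout from the question (lowercase column names, parent columns first, a fresh 0..3 index), something along these lines may work. It is only a sketch: it assumes the record_path approach with an underscore prefix, and the variable name flat and the final column order are my own choices.

```python
import pandas as pd

data = [
    {"id": 1, "status": "Public",
     "Options": [{"id": 8, "pId": 9}, {"id": 10, "pId": 11}]},
    {"id": 2, "status": "Public",
     "Options": [{"id": 12, "pId": 13}, {"id": 14, "pId": 15}]},
]

# One output row per dict in "Options"; the parent's id and status are
# repeated on each row. The prefix keeps the nested "id" from clashing
# with the parent "id".
flat = pd.json_normalize(
    data,
    record_path="Options",
    meta=["id", "status"],
    record_prefix="options_",
)

# Lowercase the column names and put the parent columns first.
flat.columns = flat.columns.str.lower()
flat = flat[["id", "status", "options_id", "options_pid"]]
print(flat)
```

Because json_normalize builds a new frame here, the index already runs from 0 to 3 and no reset_index or drop_duplicates step is needed.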
