Pandas DataFrame: How to flatten nested dictionaries inside a list into new rows
I am trying to flatten an API response. This is the response:
data = [
    {
        "id": 1,
        "status": "Public",
        "Options": [
            {"id": 8, "pId": 9},
            {"id": 10, "pId": 11}
        ]
    },
    {
        "id": 2,
        "status": "Public",
        "Options": [
            {"id": 12, "pId": 13},
            {"id": 14, "pId": 15}
        ]
    }
]
This is what I am trying (applying ast.literal_eval, df.pop and json_normalize), and then I concatenate the results:
import ast
import pandas as pd
from pandas import json_normalize

def pop(child_df, column_value):
    # Drop rows where the nested column is missing
    child_df = child_df.dropna(subset=[column_value])
    # If the nested values arrived as strings, parse them into Python objects
    if isinstance(child_df[column_value][0], str):
        print("yes")
        child_df[column_value] = child_df[column_value].apply(ast.literal_eval)
    # One small DataFrame per parent row, built from that row's list of dicts
    normalized_json = [json_normalize(x) for x in child_df.pop(column_value)]
    expanded_child_df = child_df.join(pd.concat(normalized_json, ignore_index=True, sort=False).add_prefix(column_value + '_'))
    expanded_child_df.columns = [str(col).replace('\r','') for col in expanded_child_df.columns]
    expanded_child_df.columns = map(str.lower, expanded_child_df.columns)
    return expanded_child_df

df = pd.DataFrame.from_dict(data)
df2 = pop(df,'Options')
This is the output I am getting:
id status options_id options_pid
0 1 Public 8 9
1 2 Public 10 11
But the code is skipping some values inside the Options list. This is the expected output:
id status options_id options_pid
0 1 Public 8 9
1 1 Public 10 11
2 2 Public 12 13
3 2 Public 14 15
What am I missing here?
You can use:
df=pd.json_normalize(data).explode('Options')
df=df.join(df['Options'].apply(pd.Series).add_prefix('options_')).drop(['Options'],axis=1).drop_duplicates()
print(df)
'''
id status options_id options_pId
0 1 Public 8 9
0 1 Public 10 11
1 2 Public 12 13
1 2 Public 14 15
'''
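As for why the original pop function loses rows, a likely explanation is index alignment: pd.concat(..., ignore_index=True) yields four child rows indexed 0 to 3, while the parent frame has only two rows indexed 0 and 1, and DataFrame.join performs a left join on the index by default. The sketch below reproduces that with two small hand-built frames (the names parent and children are illustrative, not from the question):

```python
import pandas as pd

# Parent rows as they look after Options has been popped off (index 0, 1)
parent = pd.DataFrame({"id": [1, 2], "status": ["Public", "Public"]})

# Child rows after pd.concat(..., ignore_index=True) (index 0, 1, 2, 3)
children = pd.DataFrame({
    "options_id": [8, 10, 12, 14],
    "options_pid": [9, 11, 13, 15],
})

# join defaults to a left join on the index, so only child rows 0 and 1 survive
joined = parent.join(children)
print(joined)
```

Only the child rows whose index happens to match a parent index are kept, which matches the two-row output shown in the question. Repeating the parent rows first, as explode does, gives each child a parent row to line up with.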
df = pd.json_normalize(data, record_path="Options", meta=['id','status'], record_prefix='options.')
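A self-contained sketch of the record_path approach above, adjusted to reproduce the column names in the expected output. The underscore prefix, the lowercasing step and the column reordering are assumptions made to match the question's layout, and the variable name flat is illustrative:

```python
import pandas as pd

data = [
    {"id": 1, "status": "Public",
     "Options": [{"id": 8, "pId": 9}, {"id": 10, "pId": 11}]},
    {"id": 2, "status": "Public",
     "Options": [{"id": 12, "pId": 13}, {"id": 14, "pId": 15}]},
]

# One output row per dict in Options; id and status are carried along as metadata.
# The prefix keeps the nested "id" from clashing with the top-level "id".
flat = pd.json_normalize(
    data,
    record_path="Options",
    meta=["id", "status"],
    record_prefix="options_",
)

# Lowercase the names and put the parent columns first, as in the expected output
flat.columns = flat.columns.str.lower()
flat = flat[["id", "status", "options_id", "options_pid"]]
print(flat)
```

Note that a record prefix is needed here: the nested dicts and the top level both have an id key, and json_normalize refuses to merge conflicting names without one.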
df = pd.json_normalize(data).explode('Options')
tmp = df['Options'].apply(pd.Series)
df = pd.concat([df[['id', 'status']], tmp], axis=1)
print(df)
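The explode and concat approach above leaves a repeated index (0, 0, 1, 1) and two columns both named id. A possible variation, offered as a sketch rather than the answerer's own code, resets the index and prefixes the nested columns; the names exploded, options and flat are illustrative:

```python
import pandas as pd

data = [
    {"id": 1, "status": "Public",
     "Options": [{"id": 8, "pId": 9}, {"id": 10, "pId": 11}]},
    {"id": 2, "status": "Public",
     "Options": [{"id": 12, "pId": 13}, {"id": 14, "pId": 15}]},
]

# One row per element of Options, then a fresh 0..n-1 index
exploded = pd.json_normalize(data).explode("Options").reset_index(drop=True)

# Build the nested columns from the list of dicts and prefix them
options = pd.DataFrame(exploded["Options"].tolist()).add_prefix("options_")

# Both frames now share the same RangeIndex, so a column-wise concat lines up
flat = pd.concat([exploded[["id", "status"]], options], axis=1)
flat.columns = flat.columns.str.lower()
print(flat)
```

Building the frame from a list of dicts also avoids calling pd.Series once per row, which tends to be slow on larger responses.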