How to create nested JSON from a pandas DataFrame?

Question

I am trying to generate nested JSON from a DataFrame in which the attributes of each car are spread across several rows.

DataFrame

import pandas as pd

cars = {'brand': ['Honda', 'Toyota', 'Ford', 'Audi', 'Honda', 'Toyota', 'Ford', 'Audi'],
        'model': ['Civic', 'Corolla', 'Focus', 'A4', 'Civic', 'Corolla', 'Focus', 'A4'],
        'attributeName': ['color', 'color', 'color', 'color', 'doors', 'doors', 'doors', 'doors'],
        'attributeValue': ['red', 'blue', 'black', 'red', 2, 4, 4, 2]
        }

df = pd.DataFrame(cars)

What I tried

At first I grouped the rows and tried to apply the nesting:

df.groupby(['brand','model'])\
             .apply(lambda x: x[['attributeName','attributeValue']].to_dict('records'))\
             .to_json(orient='records')

Result

[[{"attributeName":"color","attributeValue":"red"},{"attributeName":"doors","attributeValue":2}],[{"attributeName":"color","attributeValue":"black"},{"attributeName":"doors","attributeValue":4}],[{"attributeName":"color","attributeValue":"red"},{"attributeName":"doors","attributeValue":2}],[{"attributeName":"color","attributeValue":"blue"},{"attributeName":"doors","attributeValue":4}]]

Expected result

[
    {
        'brand':'Honda',
        'model':'Civic',
        'attributes':[
            {
                'name':'color',
                'value':'red'
            }
        ]
    },
    {...}
]

So what can I do to also get the other fields (brand and model) in the output, and not only the attributes?

Answer 1

Add rename to your solution, then call reset_index(name='attributes') before to_json:

d = {'attributeName': 'name', 'attributeValue': 'value'}
j = (df.rename(columns=d)
       .groupby(['brand', 'model'])
       .apply(lambda x: x[['name', 'value']].to_dict('records'))
       .reset_index(name='attributes')
       .to_json(orient='records'))
print(j)
[{"brand":"Audi","model":"A4","attributes":[{"name":"color","value":"red"},{"name":"doors","value":2}]},{"brand":"Ford","model":"Focus","attributes":[{"name":"color","value":"black"},{"name":"doors","value":4}]},{"brand":"Honda","model":"Civic","attributes":[{"name":"color","value":"red"},{"name":"doors","value":2}]},{"brand":"Toyota","model":"Corolla","attributes":[{"name":"color","value":"blue"},{"name":"doors","value":4}]}]
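To see why reset_index makes the difference, it can help to look at the intermediate objects. The sketch below assumes the sample data from the question; the variable names nested and flat are only illustrative. After groupby(...).apply(...), brand and model sit in the index of a Series, and to_json(orient='records') on that Series serializes only the values. Moving the index levels back into columns is what brings them into the JSON.

```python
import pandas as pd

cars = {'brand': ['Honda', 'Toyota', 'Ford', 'Audi', 'Honda', 'Toyota', 'Ford', 'Audi'],
        'model': ['Civic', 'Corolla', 'Focus', 'A4', 'Civic', 'Corolla', 'Focus', 'A4'],
        'attributeName': ['color', 'color', 'color', 'color', 'doors', 'doors', 'doors', 'doors'],
        'attributeValue': ['red', 'blue', 'black', 'red', 2, 4, 4, 2]}
df = pd.DataFrame(cars)
d = {'attributeName': 'name', 'attributeValue': 'value'}

# Step 1: one list of {name, value} dicts per (brand, model) group.
# The result is a Series; brand and model are index levels, not columns.
nested = (df.rename(columns=d)
            .groupby(['brand', 'model'])
            .apply(lambda x: x[['name', 'value']].to_dict('records')))
print(list(nested.index.names))

# Step 2: turn the index levels back into columns and name the values column.
flat = nested.reset_index(name='attributes')
print(list(flat.columns))
```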

Or, if you want one record per attribute:

d = {'attributeName': 'name', 'attributeValue': 'value'}
j = (df.rename(columns=d)
       .groupby(['brand', 'model'])
       .apply(lambda x: x[['name', 'value']].to_dict('records'))
       .explode()
       .apply(lambda x: [x])
       .reset_index(name='attributes')
       .to_json(orient='records'))
print(j)
[{"brand":"Audi","model":"A4","attributes":[{"name":"color","value":"red"}]},{"brand":"Audi","model":"A4","attributes":[{"name":"doors","value":2}]},{"brand":"Ford","model":"Focus","attributes":[{"name":"color","value":"black"}]},{"brand":"Ford","model":"Focus","attributes":[{"name":"doors","value":4}]},{"brand":"Honda","model":"Civic","attributes":[{"name":"color","value":"red"}]},{"brand":"Honda","model":"Civic","attributes":[{"name":"doors","value":2}]},{"brand":"Toyota","model":"Corolla","attributes":[{"name":"color","value":"blue"}]},{"brand":"Toyota","model":"Corolla","attributes":[{"name":"doors","value":4}]}]

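Similar per-attribute records, in the original row order, can also be built without groupby by creating the list column row by row:

df['attributes'] = df.apply(lambda x: [{'name': x['attributeName'], 'value': x['attributeValue']}], axis=1)
df = df.drop(['attributeName', 'attributeValue'], axis=1)
print(df)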
    brand    model                             attributes
0   Honda    Civic    [{'name': 'color', 'value': 'red'}]
1  Toyota  Corolla   [{'name': 'color', 'value': 'blue'}]
2    Ford    Focus  [{'name': 'color', 'value': 'black'}]
3    Audi       A4    [{'name': 'color', 'value': 'red'}]
4   Honda    Civic        [{'name': 'doors', 'value': 2}]
5  Toyota  Corolla        [{'name': 'doors', 'value': 4}]
6    Ford    Focus        [{'name': 'doors', 'value': 4}]
7    Audi       A4        [{'name': 'doors', 'value': 2}]

j = df.to_json(orient='records')
print (j)
[{"brand":"Honda","model":"Civic","attributes":[{"name":"color","value":"red"}]},{"brand":"Toyota","model":"Corolla","attributes":[{"name":"color","value":"blue"}]},{"brand":"Ford","model":"Focus","attributes":[{"name":"color","value":"black"}]},{"brand":"Audi","model":"A4","attributes":[{"name":"color","value":"red"}]},{"brand":"Honda","model":"Civic","attributes":[{"name":"doors","value":2}]},{"brand":"Toyota","model":"Corolla","attributes":[{"name":"doors","value":4}]},{"brand":"Ford","model":"Focus","attributes":[{"name":"doors","value":4}]},{"brand":"Audi","model":"A4","attributes":[{"name":"doors","value":2}]}]
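For comparison, the grouped structure (one object per brand and model, with all of its attributes) can also be assembled with a plain Python loop and json.dumps instead of groupby.apply. This is an alternative sketch rather than part of the answer above; it assumes the original DataFrame from the question, and the names result and attributes are illustrative.

```python
import json
import pandas as pd

cars = {'brand': ['Honda', 'Toyota', 'Ford', 'Audi', 'Honda', 'Toyota', 'Ford', 'Audi'],
        'model': ['Civic', 'Corolla', 'Focus', 'A4', 'Civic', 'Corolla', 'Focus', 'A4'],
        'attributeName': ['color', 'color', 'color', 'color', 'doors', 'doors', 'doors', 'doors'],
        'attributeValue': ['red', 'blue', 'black', 'red', 2, 4, 4, 2]}
df = pd.DataFrame(cars)

result = []
# Iterating a groupby yields (key, sub-DataFrame) pairs; keys are sorted by default.
for (brand, model), group in df.groupby(['brand', 'model']):
    attributes = [
        {'name': name, 'value': value}
        for name, value in zip(group['attributeName'], group['attributeValue'])
    ]
    result.append({'brand': brand, 'model': model, 'attributes': attributes})

j = json.dumps(result)
print(j)
```

The loop is more verbose, but it keeps the shape of each output object visible in one place, which can be easier to adapt if the target JSON grows more levels.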

Answer 2

A little late, but this should give you the desired output:

d = {'attributeName': 'name', 'attributeValue': 'value'}
df_cars = (df.rename(columns=d)
             .groupby(['brand', 'model'])
             .apply(lambda x: x[['name', 'value']].drop_duplicates().to_dict('records'))
             .reset_index()
             .rename(columns={0: 'attributes'}))

df_cars.head(10)
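Note that df_cars is still a DataFrame at this point, not JSON. A minimal sketch of the remaining step, assuming the sample data from the question: to_json(orient='records') serializes one object per row, and the json.loads round-trip is only there to check the nesting. The name records is illustrative.

```python
import json
import pandas as pd

cars = {'brand': ['Honda', 'Toyota', 'Ford', 'Audi', 'Honda', 'Toyota', 'Ford', 'Audi'],
        'model': ['Civic', 'Corolla', 'Focus', 'A4', 'Civic', 'Corolla', 'Focus', 'A4'],
        'attributeName': ['color', 'color', 'color', 'color', 'doors', 'doors', 'doors', 'doors'],
        'attributeValue': ['red', 'blue', 'black', 'red', 2, 4, 4, 2]}
df = pd.DataFrame(cars)

d = {'attributeName': 'name', 'attributeValue': 'value'}
df_cars = (df.rename(columns=d)
             .groupby(['brand', 'model'])
             .apply(lambda x: x[['name', 'value']].drop_duplicates().to_dict('records'))
             .reset_index()
             .rename(columns={0: 'attributes'}))

# Serialize one JSON object per (brand, model) row.
j = df_cars.to_json(orient='records')

# Parse the string back into Python objects to inspect the structure.
records = json.loads(j)
print(records[0])
```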
