
Pandas: nested json/dict to json_normalize DataFrame and json_normalize DataFrame to nested json/dict

I have a large amount of JSON data that I want to process, so I chose pandas for this.

I have nested JSON like this:

json_data = [
    {
        "item": "Item1",
        "lowestPrice": {
            "price": 11.00,
            "currency": "EUR",
        },
    },
    {
        "item": "Item2",
        "lowestPrice": {
            "price": 12.00,
            "currency": "EUR",
        }
    },
    {
        "item": "Item3",
        "lowestPrice": {
            "price": 13.00,
            "currency": "EUR",
        }
    }
]

I used json_normalize() to flatten the nested JSON:

import pandas as pd

df = pd.json_normalize(json_data, max_level=2)

        item  lowestPrice.price lowestPrice.currency
0  Item1               11.0                  EUR
1  Item2               12.0                  EUR
2  Item3               13.0                  EUR
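
As a side note on how the column names above are formed: json_normalize joins the keys along each nested path with a separator, which is "." by default and can be changed with the sep parameter. A minimal sketch, using a one-record sample modeled on the data above (the variable names are illustrative only):

```python
import pandas as pd

# One record shaped like the question's data (illustrative sample).
sample = [{"item": "Item1", "lowestPrice": {"price": 11.00, "currency": "EUR"}}]

# Default separator: nested keys are joined with "."
flat_default = pd.json_normalize(sample)

# Custom separator: nested keys are joined with "_"
flat_custom = pd.json_normalize(sample, sep="_")

print(list(flat_default.columns))
print(list(flat_custom.columns))
```

Whichever separator is used when flattening is the one to split on when rebuilding the nested structure later.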

#do something

Now I need the data back as nested JSON or a dict, like this:

json_data = [
    {
        "item": "Item1",
        "lowestPrice": {
            "price": 11.00,
            "currency": "EUR",
        },
        "annotatePrice": 15.00
    },
    {
        "item": "Item2",
        "lowestPrice": {
            "price": 12.00,
            "currency": "EUR",
        },
        "annotatePrice": 15.00
    },
    {
        "item": "Item3",
        "lowestPrice": {
            "price": 13.00,
            "currency": "EUR",
        },
        "annotatePrice": 15.00
    }
]

First, I added the annotatePrice column to the DataFrame. Then I constructed the inner dictionary for lowestPrice, followed by the outer dictionary. I based my solution on this Stack Overflow answer.

Below is the DataFrame after adding the annotatePrice column.

[Image: the DataFrame with the added annotatePrice column]

Conversion code:

import pandas as pd

df = pd.json_normalize(json_data, max_level=2)
df['annotatePrice'] = 15

# Flattened column names mapped back to the keys of the inner lowestPrice dict.
inner = {'lowestPrice.price': 'price', 'lowestPrice.currency': 'currency'}

json_data = (df.groupby(['item', 'annotatePrice'])
       # Take the first row of each group and turn its lowestPrice.* columns into the inner dict.
       .apply(lambda x: x[list(inner)].rename(columns=inner).to_dict('records')[0])
       .reset_index()
       .rename(columns={0: 'lowestPrice'})
       .to_dict(orient='records'))

json_data

Output:

[
    {
        'annotatePrice': 15,
        'item': 'Item1',
        'lowestPrice': {
            'currency': 'EUR',
            'price': 11.0
        }
    },
    {
        'annotatePrice': 15,
        'item': 'Item2',
        'lowestPrice': {
            'currency': 'EUR',
            'price': 12.0
        }
    },
    {
        'annotatePrice': 15,
        'item': 'Item3',
        'lowestPrice': {
            'currency': 'EUR',
            'price': 13.0
        }
    }
]
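
The groupby approach above hard-codes the lowestPrice columns and, because it keeps only the first row of each group, assumes every (item, annotatePrice) pair is unique. As a more general sketch, not taken from the original answer, the nesting can also be rebuilt by splitting each flattened column name on the separator. The helper name unflatten is hypothetical, and the sketch assumes the original keys do not themselves contain "." and that no key is both a leaf and a parent:

```python
import pandas as pd


def unflatten(record, sep="."):
    """Rebuild a nested dict from one flat record whose keys are sep-joined paths.

    Hypothetical helper for illustration; assumes keys never contain sep themselves.
    """
    nested = {}
    for key, value in record.items():
        *parents, leaf = key.split(sep)
        node = nested
        # Walk (or create) the intermediate dicts, then set the leaf value.
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Two records shaped like the question's data (illustrative sample).
records = [
    {"item": "Item1", "lowestPrice": {"price": 11.00, "currency": "EUR"}},
    {"item": "Item2", "lowestPrice": {"price": 12.00, "currency": "EUR"}},
]

df = pd.json_normalize(records)
df["annotatePrice"] = 15

# Convert each row to a flat dict, then restore the nesting row by row.
nested_records = [unflatten(row) for row in df.to_dict(orient="records")]
print(nested_records)
```

This version keeps the original row order and handles any depth of dotted column names. Columns that were missing in some input records come back from json_normalize as NaN, so they may need to be dropped or replaced before converting.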
