如何將帶有 Json 的列轉換為 Pandas 中的 dataframe 列？

Question

我有這個 DataFrame，其中一個字段是 json：

       id               lastActionDate                                            clients
    0  26   2021-07-02T13:59:35.273662  [{'id': '123', 'personType': 1, 'profileType', 'businessName': 'Michael Corleone':...
    1  30  2021-07-24T15:44:38.2403574  [{'id': '456', 'personType': 1, 'profileType', 'businessName': 'Vito Corleone', :...

我想將此 json 轉換為列，如下面的預期結果：

       id               lastActionDate   id  personType  profileType      businessName
    0  26   2021-07-02T13:59:35.273662   123           1            2  Michael Corleone
    1  30  2021-07-24T15:44:38.2403574   456           1            2     Vito Corleone

我該怎么做這個代碼？

Answer 1

您可以做的是定義以下 function（適用於任何數據框）：

def flatten_nested_json_df(df):
    df = df.reset_index()
    s = (df.applymap(type) == list).all()
    list_columns = s[s].index.tolist()
    
    s = (df.applymap(type) == dict).all()
    dict_columns = s[s].index.tolist()

    
    while len(list_columns) > 0 or len(dict_columns) > 0:
        new_columns = []

        for col in dict_columns:
            horiz_exploded = pd.json_normalize(df[col]).add_prefix(f'{col}.')
            horiz_exploded.index = df.index
            df = pd.concat([df, horiz_exploded], axis=1).drop(columns=[col])
            new_columns.extend(horiz_exploded.columns) # inplace

        for col in list_columns:
            print(f"exploding: {col}")
            df = df.drop(columns=[col]).join(df[col].explode().to_frame())
            new_columns.append(col)

        s = (df[new_columns].applymap(type) == list).all()
        list_columns = s[s].index.tolist()

        s = (df[new_columns].applymap(type) == dict).all()
        dict_columns = s[s].index.tolist()
    return df

並做：

flatten_nested_json_df(your_df)

其中your_df是您發布的 dataframe。

示例：

df_f = {
    "school_name": "ABC primary school",
    "class": "Year 1",
    "students": [
    {
        "id": "A001",
        "name": "Tom",
        "math": 60,
        "physics": 66,
        "chemistry": 61
    },
    {
        "id": "A002",
        "name": "James",
        "math": 89,
        "physics": 76,
        "chemistry": 51
    },
    {
        "id": "A003",
        "name": "Jenny",
        "math": 79,
        "physics": 90,
        "chemistry": 78
    }]
}

和

df = pd.json_normalize(df_f)

這是

 school_name      class   \
0  ABC primary school  Year 1   

                       students                       
0  [{'id': 'A001', 'name': 'Tom', 'math': 60, 'ph...

並申請 function

flatten_nested_json_df(df)

將返回：

index     school_name      class  students.id students.name  students.math  \
0    0    ABC primary school  Year 1     A001           Tom          60         
0    0    ABC primary school  Year 1     A002         James          89         
0    0    ABC primary school  Year 1     A003         Jenny          79         

   students.physics  students.chemistry  
0         66                 61          
0         76                 51          
0         90                 78

Answer 2

你可以這樣做：

 df_nested_list = pd.json_normalize(data, record_path =['clients'])

如何將帶有 Json 的列轉換為 Pandas 中的 dataframe 列？

問題描述

2 個解決方案

解決方案1
0 2022-02-21 14:46:22

解決方案2
0 2022-02-21 15:04:02

如何將帶有 Json 的列轉換為 Pandas 中的 dataframe 列？

問題描述

2 個解決方案

解決方案1 0 2022-02-21 14:46:22

解決方案2 0 2022-02-21 15:04:02

解決方案1
0 2022-02-21 14:46:22

解決方案2
0 2022-02-21 15:04:02