How do I convert a list of dictionaries to a dataframe?

Question

I have the following list of dictionaries, which I need to convert to a dataframe. I tried using a MultiIndex but couldn't convert the whole structure.

response = [{
"name": "xyz",
"empId": "007",
"details": [{
        "address": [{
            "street": "x street",
            "city": "x city"
        }, {
            "street": "xx street",
            "city": "xx city"
        }],
        "country": "xxz country"
    },
    {
        "address": [{
            "street": "y street",
            "city": "y city"
        }, {
            "street": "yy street",
            "city": "yy city"
        }],
        "country": "yyz country"
    }
]
}]

I managed to convert the inner list of dictionaries to a dataframe with the following code:

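import pandas as pd

# Assumed: `details` is the inner list taken from the first record.
details = response[0]['details']
frames = []

for i in details:
    Country = i['country']

    street = []
    city = []
    # One index row per address: (country, 1-based serial number).
    index = pd.MultiIndex.from_arrays([[Country]*len(i['address']), list(range(1,len(i['address'])+1))], names=['Country', 'SL No'])
    df = pd.DataFrame(columns=["Street","City"], index=index)
    if i['address']:
        for row in i['address']:
            street.append(row['street'])
            city.append(row['city'])

    df["Street"] = street
    df["City"] = city

    frames.append(df)

# Stack the per-country frames into one dataframe.
df_final = pd.concat(frames)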

Output obtained:

Country     SL No   Street     City
xxz country 1       x street   x city
            2      xx street  xx city
yyz country 1       y street   y city
            2      yy street  yy city

How can I convert the list of dictionaries to a dataframe while keeping all the information?

The final output that I want:

Name    EmpId    Country        Street     City
xyz     007      xxz country    x street   x city
                                xx street  xx city
                 yyz country    y street   y city
                                yy street  yy city

Answer 1

Use json_normalize with DataFrame.set_index:

df = pd.json_normalize(response,
                       record_path=['details','address'],
                       meta=['name','empId', ['address','country']]
                       )

df = df.set_index(['name','empId','address.country'])
print (df)
                               street     city
name empId address.country                    
xyz  007   xxz country       x street   x city
           xxz country      xx street  xx city
           yyz country       y street   y city
           yyz country      yy street  yy city

For older pandas versions use:

df = pd.io.json.json_normalize(response,
                               record_path=['details','address'],
                               meta=['name','empId', ['address','country']]
                       )
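
If the same script has to run on both old and new pandas, one option is to resolve the function once at import time and call it by a single name afterwards. This is only a minimal sketch; the `records` sample is made up for illustration and is not part of the question's data.

```python
import pandas as pd

try:
    # Top-level location in newer pandas releases.
    from pandas import json_normalize
except ImportError:
    # Fallback location used by older pandas releases.
    from pandas.io.json import json_normalize

# Hypothetical sample record, only to show the call.
records = [{"id": 1, "info": {"city": "x city"}}]
flat = json_normalize(records)
print(flat)
```

The rest of the code can then call json_normalize(...) with the same record_path and meta arguments regardless of the installed version.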

EDIT:

Tested with multiple records and it works well:

response = [{
"name": "xyz",
"empId": "007",
"details": [{
        "address": [{
            "street": "x street",
            "city": "x city"
        }, {
            "street": "xx street",
            "city": "xx city"
        }],
        "country": "xxz country"
    },
    {
        "address": [{
            "street": "y street",
            "city": "y city"
        }, {
            "street": "yy street",
            "city": "yy city"
        }],
        "country": "yyz country"
    }
]
},
            {
"name": "xyz1",
"empId": "0071",
"details": [{
        "address": [{
            "street": "x street1",
            "city": "x city1"
        }, {
            "street": "xx stree1t",
            "city": "xx city1"
        }],
        "country": "xxz country"
    },
    {
        "address": [{
            "street": "y street",
            "city": "y city"
        }, {
            "street": "yy street",
            "city": "yy city"
        }],
        "country": "yyz country"
    }
]
}]

df = pd.json_normalize(response,
                       record_path=['details','address'],
                       meta=['name','empId', ['address','country']]
                       )

df = df.set_index(['name','empId','address.country'])

print (df)
                                street      city
name empId address.country                      
xyz  007   xxz country        x street    x city
           xxz country       xx street   xx city
           yyz country        y street    y city
           yyz country       yy street   yy city
xyz1 0071  xxz country       x street1   x city1
           xxz country      xx stree1t  xx city1
           yyz country        y street    y city
           yyz country       yy street   yy city
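
The index and column names above come straight from the JSON keys. To get closer to the headers in the requested output (Name, EmpId, Country, Street, City), the columns can be renamed before setting the index. A minimal sketch, assuming the rename mapping below is what the asker wants; the data is a trimmed copy of the sample from the question.

```python
import pandas as pd

# Trimmed copy of the sample data from the question.
response = [{
    "name": "xyz",
    "empId": "007",
    "details": [
        {"address": [{"street": "x street", "city": "x city"},
                     {"street": "xx street", "city": "xx city"}],
         "country": "xxz country"},
        {"address": [{"street": "y street", "city": "y city"},
                     {"street": "yy street", "city": "yy city"}],
         "country": "yyz country"},
    ],
}]

df = pd.json_normalize(response,
                       record_path=['details', 'address'],
                       meta=['name', 'empId', ['address', 'country']])

# Assumed mapping from the generated names to the headers in the question.
renamed = df.rename(columns={'name': 'Name', 'empId': 'EmpId',
                             'address.country': 'Country',
                             'street': 'Street', 'city': 'City'})
out = renamed.set_index(['Name', 'EmpId', 'Country'])
print(out)
```

pandas still prints the innermost index level (Country) on every row, so the display will not blank out repeated countries the way the hand-written target does, but the data and grouping are the same.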

Answer 2

As far as I know, there is no easy way to do it, since your data contains multiple levels of lists. Although a bit convoluted, the following should work. The code iteratively explodes the lists and converts the dictionaries to columns with json_normalize.

df = pd.DataFrame.from_records(response)
df = df.explode('details', ignore_index=True)
df = pd.concat([df, pd.json_normalize(df['details'])], axis=1)
df = df.explode('address', ignore_index=True)
df = pd.concat([df, pd.json_normalize(df['address'])], axis=1)
df = df.drop(columns=['details', 'address'])

Result:

  name empId      country     street     city
0  xyz   007  xxz country   x street   x city
1  xyz   007  xxz country  xx street  xx city
2  xyz   007  yyz country   y street   y city
3  xyz   007  yyz country  yy street  yy city
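
The explode-then-normalize steps above can also be written as a loop, so the nesting depth does not have to be spelled out by hand. This is a rough sketch, not a general-purpose flattener: `flatten` is a hypothetical helper name, it needs pandas 1.1.0 or newer for ignore_index, and it assumes no empty lists, no missing values in the nested columns, and no clashing key names.

```python
import pandas as pd

def flatten(df):
    """Hypothetical helper: explode list columns and expand dict columns until none remain."""
    def holds(col, kind):
        # True if any cell in the column is an instance of `kind`.
        return df[col].map(lambda v: isinstance(v, kind)).any()

    while True:
        # Pick the first column that still contains lists or dicts.
        col = next((c for c in df.columns if holds(c, (list, dict))), None)
        if col is None:
            return df
        if holds(col, list):
            # One row per list element.
            df = df.explode(col, ignore_index=True)
        else:
            # One column per dictionary key, replacing the dict column.
            expanded = pd.json_normalize(df[col].tolist())
            df = pd.concat([df.drop(columns=col), expanded], axis=1)

# Trimmed copy of the sample data from the question.
response = [{
    "name": "xyz",
    "empId": "007",
    "details": [
        {"address": [{"street": "x street", "city": "x city"},
                     {"street": "xx street", "city": "xx city"}],
         "country": "xxz country"},
        {"address": [{"street": "y street", "city": "y city"},
                     {"street": "yy street", "city": "yy city"}],
         "country": "yyz country"},
    ],
}]

flat = flatten(pd.DataFrame.from_records(response))
print(flat)
```

Processing one column per pass keeps the logic simple; if several sibling columns held lists at the same level, exploding them one after another would produce every combination of their elements, which may or may not be what is wanted.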

Note: For pandas versions older than 1.1.0, explode does not have the ignore_index parameter. Instead, use reset_index(drop=True) after the explode.

In addition, in older pandas versions you need to use pd.io.json.json_normalize instead of pd.json_normalize.
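
For reference, here is how the same steps might look with reset_index(drop=True) in place of ignore_index. This is a sketch under the assumption that only the explode calls need to change; it passes plain lists to json_normalize, and on old pandas versions pd.json_normalize would still have to be swapped for pd.io.json.json_normalize as noted above.

```python
import pandas as pd

# Trimmed copy of the sample data from the question.
response = [{
    "name": "xyz",
    "empId": "007",
    "details": [
        {"address": [{"street": "x street", "city": "x city"},
                     {"street": "xx street", "city": "xx city"}],
         "country": "xxz country"},
        {"address": [{"street": "y street", "city": "y city"},
                     {"street": "yy street", "city": "yy city"}],
         "country": "yyz country"},
    ],
}]

df = pd.DataFrame.from_records(response)

# explode repeats the original index labels, so reset them before concat.
df = df.explode('details').reset_index(drop=True)
df = pd.concat([df, pd.json_normalize(df['details'].tolist())], axis=1)

df = df.explode('address').reset_index(drop=True)
df = pd.concat([df, pd.json_normalize(df['address'].tolist())], axis=1)

df = df.drop(columns=['details', 'address'])
print(df)
```

Resetting the index matters here because concat with axis=1 aligns rows by index label; leaving the duplicated labels from explode in place would misalign or fail.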

Answer 1
3 2020-11-20 07:03:55

Answer 2
0 2020-11-20 06:25:21
