Handling nested lists in pandas

Question

How can I turn a nested list with dicts inside into extra columns in a dataframe in Python?

I received information within a dict from an API:

{'orders': 
[
{   'orderId': '2838168630', 
    'dateTimeOrderPlaced': '2020-01-22T18:37:29+01:00', 
    'orderItems': [{    'orderItemId':  'BFC0000361764421', 
                        'ean': '234234234234234', 
                        'cancelRequest': False, 
                        'quantity': 1}
                        ]}, 

{   'orderId': '2708182540', 
    'dateTimeOrderPlaced': '2020-01-22T17:45:36+01:00', 
    'orderItems': [{    'orderItemId':  'BFC0000361749496', 
                        'ean': '234234234234234', 
                        'cancelRequest': False, 
                        'quantity': 3}
                        ]}, 

{   'orderId': '2490844970', 
    'dateTimeOrderPlaced': '2019-08-17T14:21:46+02:00', 
    'orderItems': [{    'orderItemId': 'BFC0000287505870', 
                        'ean': '234234234234234', 
                        'cancelRequest': True, 
                        'quantity': 1}
                        ]}
]}

which I managed to turn into a simple dataframe by doing this:

pd.DataFrame(recieved_data.get('orders'))

Output:

orderId    date    orderItems
1          1-12    [{'orderItemId': 'dfs13', 'ean': '34234'}]
2          etc.
...

I would like to have something like this:

orderId    date    orderItemId    ean
1          1-12    dfs13         34234
2          etc.
...

I already tried to single out the orderItems column with iloc and then turn it into a list so I can try to extract the values again. However, I then still end up with a list from which I need to extract another list, which has the dict in it.

Answer 1

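# Load the dataframe as you have already done.

# Expand the orderItems column into its own dataframe
temp_df = df['orderItems'].apply(pd.Series)

# Concatenate temp_df and the original df column-wise
# (axis=1 is needed; the default axis=0 would stack the rows instead)
final_df = pd.concat([df, temp_df], axis=1)

# Drop columns if required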

Hope it works for you.

Cheers
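
For reference, here is a minimal sketch of what the apply(pd.Series) step yields when each cell holds a one-element list, as in the question's data. The two-row frame below is a trimmed, hypothetical stand-in for the real API response, not the asker's actual dataframe.

```python
import pandas as pd

# Hypothetical, trimmed stand-in for the dataframe built in the question.
df = pd.DataFrame({
    "orderId": ["2838168630", "2708182540"],
    "orderItems": [
        [{"orderItemId": "BFC0000361764421", "quantity": 1}],
        [{"orderItemId": "BFC0000361749496", "quantity": 3}],
    ],
})

# Each list is expanded into one column per list position.
temp_df = df["orderItems"].apply(pd.Series)

print(temp_df.columns.tolist())   # [0]
print(type(temp_df.loc[0, 0]))    # <class 'dict'>
```

The single column 0 still holds dicts, so the dicts themselves need one more flattening step before they become separate columns.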

Answer 2

By combining the answers on this question I reached my end goal. I did the following:

# Unlist the orderItems column (each cell holds a one-element list)
temp_df = df['orderItems'].apply(pd.Series)

# Put the items in orderItems into separate columns
# (pandas >= 1.0 exposes json_normalize as pd.json_normalize)
temp_df_json = pd.json_normalize(temp_df[0])

# Join the tables
final_df = df.join(temp_df_json)

# Drop the old orderItems column for a clean table
final_df = final_df.drop(["orderItems"], axis=1)

Also, instead of .concat() I applied .join() to join both tables based on the existing index.
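
To illustrate the difference, here is a small sketch with two made-up frames (the names left and right are purely illustrative and do not come from the answers). It compares a default concat with an index-based join.

```python
import pandas as pd

# Hypothetical frames that share the same default index (0, 1).
left = pd.DataFrame({"orderId": ["A", "B"]})
right = pd.DataFrame({"ean": ["111", "222"]})

# Default concat (axis=0) stacks the rows; columns missing on one side become NaN.
stacked = pd.concat([left, right])

# join aligns on the index and puts the columns side by side.
joined = left.join(right)

print(stacked.shape)  # (4, 2)
print(joined.shape)   # (2, 2)
```

Passing axis=1 to pd.concat would also place the columns side by side, so either call can work as long as both frames share the same index.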

Answer 3

Just to make it clear, you are receiving JSON from the API, so you can try to use the function json_normalize. Try this:

import pandas as pd

# DataFrame initialization
df = pd.DataFrame({"orderId": [1], "date": ["1-12"], "orderItems": [{'orderItemId': 'dfs13', 'ean': '34234'}]})

# Flatten the inner dicts into their own columns
# (pandas >= 1.0 exposes this as pd.json_normalize; older versions imported it from pandas.io.json)
sub_df = pd.json_normalize(df["orderItems"].tolist())

# Drop the original nested column
df = df.drop(["orderItems"], axis=1)

# Join both dataframes on the index
df.join(sub_df)

So the output is:

    orderId date    ean     orderItemId
0   1       1-12    34234   dfs13
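
If the raw API response is still at hand, json_normalize can probably do the whole job in one call through its record_path and meta parameters. The sketch below assumes pandas 1.0 or later (for pd.json_normalize) and uses a trimmed two-order copy of the response from the question. The variable names received_data and flat are illustrative.

```python
import pandas as pd

# Trimmed copy of the API response shown in the question (two orders only).
received_data = {
    "orders": [
        {
            "orderId": "2838168630",
            "dateTimeOrderPlaced": "2020-01-22T18:37:29+01:00",
            "orderItems": [
                {"orderItemId": "BFC0000361764421", "ean": "234234234234234",
                 "cancelRequest": False, "quantity": 1},
            ],
        },
        {
            "orderId": "2708182540",
            "dateTimeOrderPlaced": "2020-01-22T17:45:36+01:00",
            "orderItems": [
                {"orderItemId": "BFC0000361749496", "ean": "234234234234234",
                 "cancelRequest": False, "quantity": 3},
            ],
        },
    ]
}

# record_path names the nested list to unpack (one output row per order item);
# meta lists the order-level fields to repeat on each of those rows.
flat = pd.json_normalize(
    received_data["orders"],
    record_path="orderItems",
    meta=["orderId", "dateTimeOrderPlaced"],
)

print(flat)
```

Because this produces one row per order item, it should also cope with orders that contain more than one item, which the index-based join above does not handle.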

Answer 1
1 2020-01-22 21:28:51

Answer 2
1 2020-01-23 15:39:53

Answer 3
0 2020-01-22 19:55:08
