Change a column containing a list of dicts to columns in a DataFrame

I have the following DataFrame, which contains a column that is a list of dict items:

    import pandas as pd

    d = pd.DataFrame([
        ['Green', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
        ['Apply', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
        ['Range', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
        ['Peop',  [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]]
        ], columns=['Name', 'Legal Description'])

and I want to transform it into a simple DataFrame like so:

    d = pd.DataFrame([
        ['Green', 'STERLING GREEN SO', '01', 'L0038', 'B0008'],
        ['Apply', 'STERLING GREEN SO', '01', 'L0038', 'B0008'],
        ['Range', 'STERLING GREEN SO', '01', 'L0038', 'B0008'],
        ['Peop',  'STERLING GREEN SO', '01', 'L0038', 'B0008']
        ], columns=['Name', 'Desc', 'Sec', 'Lot', 'Block'])

IMO, the ideal solution would be to act upstream and get a properly formatted dictionary or DataFrame.

The issue with your list of single-keyed dictionaries is that you have to merge them. You can use a dictionary comprehension for that and convert the result to a Series:

    d2 = d['Legal Description'].apply(
        # Merge the single-key dicts into one dict, dropping the ':' from each key
        lambda c: pd.Series({k.strip(':'): v for x in c for k, v in x.items()})
    )

Then join to the original DataFrame:

    d.drop(columns='Legal Description').join(d2)

Output:

        Name               Desc Sec    Lot  Block
    0  Green  STERLING GREEN SO  01  L0038  B0008
    1  Apply  STERLING GREEN SO  01  L0038  B0008
    2  Range  STERLING GREEN SO  01  L0038  B0008
    3   Peop  STERLING GREEN SO  01  L0038  B0008
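As a possible variation on the approach above, the merged dictionaries can be handed to the DataFrame constructor in one go instead of building one Series per row. This is only a sketch: the helper name `merge_dicts` is made up for illustration, the data is shortened to two rows, and it assumes every cell of the column is a list of dicts with string keys.

```python
import pandas as pd

d = pd.DataFrame([
    ['Green', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
    ['Apply', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
], columns=['Name', 'Legal Description'])


def merge_dicts(dicts):
    """Hypothetical helper: merge a list of dicts into one, stripping ':' from the keys."""
    return {key.strip(':'): value for item in dicts for key, value in item.items()}


# Build one merged dict per row and keep the original index so the join lines up.
d2 = pd.DataFrame([merge_dicts(c) for c in d['Legal Description']], index=d.index)
result = d.drop(columns='Legal Description').join(d2)
print(result)
```

Passing `index=d.index` matters if the original DataFrame does not have a default 0..n-1 index, since `join` aligns on index labels.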

If possible, you should wrangle your data before creating the DataFrame. It's faster than reshaping the DataFrame after it has been created. For instance, something like:

    import pandas as pd

    data = [
        ['Green', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
        ['Apply', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
        ['Range', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
        ['Peop',  [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]]
    ]

    records = []

    for row in data:
        name, legal_desc = row
        rec = {'Name': name}
        # Flatten the list of single-key dicts into the record
        rec.update(pair for item in legal_desc for pair in item.items())
        records.append(rec)

    d = pd.DataFrame(records)

Output:

    >>> d
        Name              Desc: Sec:   Lot: Block:
    0  Green  STERLING GREEN SO   01  L0038  B0008
    1  Apply  STERLING GREEN SO   01  L0038  B0008
    2  Range  STERLING GREEN SO   01  L0038  B0008
    3   Peop  STERLING GREEN SO   01  L0038  B0008

    >>> records
    [{'Name': 'Green', 'Desc:': 'STERLING GREEN SO', 'Sec:': '01', 'Lot:': 'L0038', 'Block:': 'B0008'},
     {'Name': 'Apply', 'Desc:': 'STERLING GREEN SO', 'Sec:': '01', 'Lot:': 'L0038', 'Block:': 'B0008'},
     {'Name': 'Range', 'Desc:': 'STERLING GREEN SO', 'Sec:': '01', 'Lot:': 'L0038', 'Block:': 'B0008'},
     {'Name': 'Peop', 'Desc:': 'STERLING GREEN SO', 'Sec:': '01', 'Lot:': 'L0038', 'Block:': 'B0008'}]
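The column names in this output still carry the trailing colon from the original keys. If plain names such as `Desc` are wanted, as in the question, one option is to clean the column labels after building the DataFrame. A minimal sketch, assuming a `records` list shaped like the one above (shortened here to two rows):

```python
import pandas as pd

records = [
    {'Name': 'Green', 'Desc:': 'STERLING GREEN SO', 'Sec:': '01', 'Lot:': 'L0038', 'Block:': 'B0008'},
    {'Name': 'Apply', 'Desc:': 'STERLING GREEN SO', 'Sec:': '01', 'Lot:': 'L0038', 'Block:': 'B0008'},
]

d = pd.DataFrame(records)

# Strip the trailing ':' from every column label; 'Name' has none and is left unchanged.
d.columns = d.columns.str.rstrip(':')
print(d.columns.tolist())
```

The same cleanup could be done earlier, on the dictionary keys themselves, if the colon-free names are needed before the DataFrame is created.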

You can also use:

    df = d.set_index('Name')
    df = df['Legal Description'].explode().apply(pd.Series).groupby(level=0).sum().reset_index()

Output:

        Name              Desc: Sec:   Lot: Block:
    0  Apply  STERLING GREEN SO   01  L0038  B0008
    1  Green  STERLING GREEN SO   01  L0038  B0008
    2   Peop  STERLING GREEN SO   01  L0038  B0008
    3  Range  STERLING GREEN SO   01  L0038  B0008
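Because `groupby` sorts its keys by default, the rows in this output come back in alphabetical order of `Name` rather than in their original order. If the original order matters, passing `sort=False` should preserve it. The sketch below is a variation, not the method shown above: it builds the expanded frame with the DataFrame constructor instead of `apply(pd.Series)`, and it uses `first()` (the first non-null value per column) in place of `sum()`. It also assumes the `Name` values are unique. Two rows are used to keep it short.

```python
import pandas as pd

d = pd.DataFrame([
    ['Green', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
    ['Apply', [{'Desc:': 'STERLING GREEN SO'}, {'Sec:': '01'}, {'Lot:': 'L0038'}, {'Block:': 'B0008'}]],
], columns=['Name', 'Legal Description'])

# One single-key dict per row, indexed by the (repeated) Name.
exploded = d.set_index('Name')['Legal Description'].explode()

# Each dict becomes a row with exactly one non-null column.
expanded = pd.DataFrame(exploded.tolist(), index=exploded.index)

# Collapse the rows of each Name, keeping the first non-null value per column;
# sort=False keeps the groups in order of first appearance.
result = expanded.groupby(level=0, sort=False).first().reset_index()
print(result)
```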

