
Splitting a pandas column containing a list of dicts

I have a pandas column in which each cell contains a list of dicts holding the color attributes of each photo, for example:

[{'color': 'black', 'confidence': 1.0}, {'color': 'brown', 'confidence': 0.72}, {'color': 'gray', 'confidence': 0.62}, {'color': 'other', 'confidence': 0.52}, {'color': 'red', 'confidence': 0.01}, {'color': 'blond', 'confidence': 0.01}, {'color': 'white', 'confidence': 0.0}]

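I would like to split this column containing a list of dicts into multiple new pandas columns. For example, I would like a column named "black" with the value 1.0, a column named "brown" with the value 0.72, and so on.

I am struggling to get this done. Any hints would be appreciated. Thanks!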

import pandas as pd

a = [{'color': 'black', 'confidence': 1.0}, {'color': 'brown', 'confidence': 0.72}, {'color': 'gray', 'confidence': 0.62}, {'color': 'other', 'confidence': 0.52}, {'color': 'red', 'confidence': 0.01}, {'color': 'blond', 'confidence': 0.01}, {'color': 'white', 'confidence': 0.0}]

# Collect the colors and confidences into two parallel lists
c = []
co = []
for d in a:
    c.append(d['color'])
    co.append(d['confidence'])

df = pd.DataFrame()
df['color'] = c
df['confidence'] = co

# Transpose so that each color becomes a column
df = df.transpose()
# Promote the first row (the colors) to the column header, then drop it
df.columns = df.iloc[0]
df = df[1:]
Output:
df
Out[159]: 
color      black brown  gray other   red blond white
confidence     1  0.72  0.62  0.52  0.01  0.01     0
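
The same reshaping can be sketched more compactly by building a single color-to-confidence mapping and passing it to the DataFrame constructor as one row. This is an alternative sketch rather than the code above; it assumes each color appears at most once in the list, and the names `row` and `wide` are illustrative only:

```python
import pandas as pd

a = [{'color': 'black', 'confidence': 1.0}, {'color': 'brown', 'confidence': 0.72},
     {'color': 'gray', 'confidence': 0.62}, {'color': 'other', 'confidence': 0.52},
     {'color': 'red', 'confidence': 0.01}, {'color': 'blond', 'confidence': 0.01},
     {'color': 'white', 'confidence': 0.0}]

# Map each color to its confidence (assumption: a repeated color would
# silently overwrite the earlier entry)
row = {d['color']: d['confidence'] for d in a}

# A list holding one dict becomes a one-row DataFrame with one column per key
wide = pd.DataFrame([row], index=['confidence'])
print(wide)
```

Because the values never pass through a transposed mixed-type frame, the resulting columns keep a float dtype.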

If this answer works for you, please accept and upvote it. Otherwise, leave a comment describing the problem and I would be happy to help.

Let's try:

pd.DataFrame(df['col'].tolist()).set_index('color').T

Output:

color       black  brown  gray  other   red  blond  white
confidence    1.0   0.72  0.62   0.52  0.01   0.01    0.0
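
The one-liner above expects `df['col'].tolist()` to be a flat list of dicts. If each cell instead holds a whole list of dicts, as described in the question, `set_index('color')` would likely fail because no `color` column is created. A minimal sketch for that case, assuming the column is named `col` and using made-up two-row data:

```python
import pandas as pd

# Hypothetical sample: each cell of "col" is a list of dicts
df = pd.DataFrame({
    "col": [
        [{"color": "black", "confidence": 1.0}, {"color": "brown", "confidence": 0.72}],
        [{"color": "black", "confidence": 0.8}],
    ]
})

# Turn every cell into one {color: confidence} dict
rows = [{d["color"]: d["confidence"] for d in cell} for cell in df["col"]]

# One output row per input row; colors missing from a cell become NaN
wide = pd.DataFrame(rows, index=df.index)
print(wide)
```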

Thanks, everyone. This worked for me. I was inspired by Tejas's answer:

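from ast import literal_eval

# Pre-create one empty column per color
for color in ["black", "brown", "gray", "other", "red", "blond", "white"]:
    df[color] = ""

for k, row in df.iterrows():
    # Each cell stores the list of dicts as a string, so parse it first
    res = literal_eval(row["Color_list"])
    for d in res:
        # .at writes into df directly; chained indexing (df[col][k] = ...)
        # may assign to a temporary copy instead
        df.at[k, d["color"]] = d["confidence"]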
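
A possible variation that avoids both the pre-created columns and the row loop is to parse the strings, explode the lists, and unstack the colors into columns. This is only a sketch under assumptions: the column name `Color_list` comes from the snippet above, while the sample strings and the names `exploded`, `flat`, `wide`, and `result` are made up for illustration:

```python
from ast import literal_eval

import pandas as pd

# Hypothetical sample: each cell is the string form of a list of dicts
df = pd.DataFrame({
    "Color_list": [
        "[{'color': 'black', 'confidence': 1.0}, {'color': 'brown', 'confidence': 0.72}]",
        "[{'color': 'black', 'confidence': 0.8}]",
    ]
})

# Parse each string into a real list, then put one dict on each row;
# explode repeats the original row label for every dict
exploded = df["Color_list"].apply(literal_eval).explode()

# Expand the dicts into "color" and "confidence" columns
flat = pd.DataFrame(exploded.tolist(), index=exploded.index)

# Move "color" into the index next to the row label, then unstack it into columns
wide = flat.set_index("color", append=True)["confidence"].unstack()

# Attach the per-color columns to the original frame by row label
result = df.join(wide)
print(result)
```

Unlike the loop, colors missing from a row come out as NaN rather than an empty string.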

You can do this by using apply with a custom function that returns a Series:

Data

import pandas as pd
import numpy as np

np.random.seed(0)

df = pd.DataFrame(
    {
        "A": ["a", "b"],
        "B": [
            [
                {"color": "black", "confidence": 1.0},
                {"color": "brown", "confidence": 0.72},
                {"color": "gray", "confidence": 0.62},
                {"color": "other", "confidence": 0.52},
                {"color": "red", "confidence": 0.01},
                {"color": "blond", "confidence": 0.01},
                {"color": "white", "confidence": 0.0},
            ],
            [
                {"color": "black", "confidence": 0.8},
                {"color": "brown", "confidence": 0.5},
                {"color": "gray", "confidence": 0.4},
                {"color": "other", "confidence": 0.32},
                {"color": "red", "confidence": 0.11},
            ],
        ],
    }
)

print(df)
   A                                                  B
0  a  [{'color': 'black', 'confidence': 1.0}, {'colo...
1  b  [{'color': 'black', 'confidence': 0.8}, {'colo...

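Approach

Since each cell is a list of dicts, we need to turn each cell into its own Series whose index is "color" and whose values are "confidence". apply then takes care of gluing these Series objects together and outputs a new DataFrame.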

def clean_cell(records, index, values):
    return (pd.DataFrame(records)
            .set_index(index)
            .rename_axis(None)
            [values])

record_df = df["B"].apply(clean_cell, args=("color", "confidence"))

print(record_df)
   black  brown  gray  other   red  blond  white
0    1.0   0.72  0.62   0.52  0.01   0.01    0.0
1    0.8   0.50  0.40   0.32  0.11    NaN    NaN
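
To keep the other columns next to the new ones, the result can be joined back onto the original frame by index. A minimal sketch with shortened data; the name `result` and the trimmed lists are illustrative only:

```python
import pandas as pd

# Shortened version of the sample data above
df = pd.DataFrame({
    "A": ["a", "b"],
    "B": [
        [{"color": "black", "confidence": 1.0}, {"color": "brown", "confidence": 0.72}],
        [{"color": "black", "confidence": 0.8}],
    ],
})

def clean_cell(records, index, values):
    # Same helper as above: one Series per cell, indexed by color
    return (pd.DataFrame(records)
            .set_index(index)
            .rename_axis(None)
            [values])

record_df = df["B"].apply(clean_cell, args=("color", "confidence"))

# Drop the raw list column and attach the per-color columns by row index
result = df.drop(columns="B").join(record_df)
print(result)
```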
