Splitting a pandas column containing a list of dicts
I have a pandas column in which each cell contains a list of dicts holding the color attributes of one photo, for example:
[{'color': 'black', 'confidence': 1.0}, {'color': 'brown', 'confidence': 0.72}, {'color': 'gray', 'confidence': 0.62}, {'color': 'other', 'confidence': 0.52}, {'color': 'red', 'confidence': 0.01}, {'color': 'blond', 'confidence': 0.01}, {'color': 'white', 'confidence': 0.0}]
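I would like to split this column of dict lists into several new pandas columns. For example, I want a column named "black" with the value 1.0, a column named "brown" with the value 0.72, and so on.
I am struggling to get this done. Any hints would be appreciated. Thanks!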
import pandas as pd

a = [{'color': 'black', 'confidence': 1.0}, {'color': 'brown', 'confidence': 0.72}, {'color': 'gray', 'confidence': 0.62}, {'color': 'other', 'confidence': 0.52}, {'color': 'red', 'confidence': 0.01}, {'color': 'blond', 'confidence': 0.01}, {'color': 'white', 'confidence': 0.0}]
c = []   # color names
co = []  # confidence values
for d in a:
    c.append(d['color'])
    co.append(d['confidence'])
df = pd.DataFrame()
df['color'] = c
df['confidence'] = co
df = df.transpose()
# use the first row (the color names) as the column header
df.columns = df.iloc[0]
df = df[1:]
Output:
df
Out[159]:
color black brown gray other red blond white
confidence 1 0.72 0.62 0.52 0.01 0.01 0
If this answer is correct, kindly accept and upvote the answer. Else, comment the doubt or issue, I would be happy to help
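Let's try this (it assumes each row of df['col'] holds a single dict; if one cell holds the whole list, pass that list to pd.DataFrame instead, e.g. df['col'].iloc[0]):
pd.DataFrame(df['col'].tolist()).set_index('color').T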
Output:
color black brown gray other red blond white
confidence 1.0 0.72 0.62 0.52 0.01 0.01 0.0
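The one-liner above covers a single list of records. As a minimal sketch of extending it to a column where every row holds its own list, one option is to explode the column into one dict per row and then reshape back to wide form. The column name col and the sample data below are assumptions for illustration, not the asker's actual DataFrame.

```python
import pandas as pd

# Hypothetical sample data: the column name "col" is an assumption
df = pd.DataFrame({
    "col": [
        [{"color": "black", "confidence": 1.0},
         {"color": "brown", "confidence": 0.72}],
        [{"color": "black", "confidence": 0.8},
         {"color": "red", "confidence": 0.11}],
    ]
})

# One dict per row; the original row label is repeated in the index
long = df["col"].explode()

# Expand each dict into "color" and "confidence" columns
records = pd.DataFrame(long.tolist(), index=long.index)

# Move "color" into the index, then unstack it into one column per color
wide = records.set_index("color", append=True)["confidence"].unstack("color")
print(wide)
```

Colors that are missing from a row come out as NaN, and unstack may reorder the color columns, so it is safer to select them by name than by position.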
Thanks, everyone. This worked for me. I was inspired by Tejas's answer:
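from ast import literal_eval

# create one empty column per color
for name in ["black", "brown", "gray", "other", "red", "blond", "white"]:
    df[name] = ""

for k, v in df.iterrows():
    # Color_list stores the list of dicts as a string, so parse it first
    res = literal_eval(v["Color_list"])
    for d in res:
        # .at writes to the frame directly; chained indexing such as
        # df[col][k] = ... may assign to a temporary copy instead
        df.at[k, d["color"]] = d["confidence"]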
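If the lists are stored as strings (which is why literal_eval is needed above), a variant without the row-by-row loop may be easier to maintain. The following is only a sketch, assuming every cell of Color_list is a string containing a valid Python list literal; the sample rows are made up for illustration.

```python
from ast import literal_eval

import pandas as pd

# Made-up sample rows; only the column name Color_list comes from the answer above
df = pd.DataFrame({
    "Color_list": [
        "[{'color': 'black', 'confidence': 1.0}, {'color': 'brown', 'confidence': 0.72}]",
        "[{'color': 'black', 'confidence': 0.8}]",
    ]
})

# Parse each string into a real list of dicts
parsed = [literal_eval(s) for s in df["Color_list"]]

# Build one {color: confidence} mapping per row; missing colors become NaN
wide = pd.DataFrame(
    [{d["color"]: d["confidence"] for d in cell} for cell in parsed],
    index=df.index,
)

# Attach the new columns to the original frame
result = df.join(wide)
print(result)
```

Unlike the loop version, this does not need the color names to be known in advance, and absent colors are left as NaN rather than empty strings.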
You can do this using apply with a custom function that returns a Series:
Data
import pandas as pd
import numpy as np
np.random.seed(0)
df = pd.DataFrame(
    {
        "A": ["a", "b"],
        "B": [
            [
                {"color": "black", "confidence": 1.0},
                {"color": "brown", "confidence": 0.72},
                {"color": "gray", "confidence": 0.62},
                {"color": "other", "confidence": 0.52},
                {"color": "red", "confidence": 0.01},
                {"color": "blond", "confidence": 0.01},
                {"color": "white", "confidence": 0.0},
            ],
            [
                {"color": "black", "confidence": 0.8},
                {"color": "brown", "confidence": 0.5},
                {"color": "gray", "confidence": 0.4},
                {"color": "other", "confidence": 0.32},
                {"color": "red", "confidence": 0.11},
            ],
        ],
    }
)
print(df)
   A                                                  B
0  a  [{'color': 'black', 'confidence': 1.0}, {'colo...
1  b  [{'color': 'black', 'confidence': 0.8}, {'colo...
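Method
Because each cell is a list of dicts, we need to turn each cell into its own Series, whose index is "color" and whose values are "confidence". apply then takes care of gluing these Series objects together and returns a new DataFrame.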
def clean_cell(records, index, values):
    return (pd.DataFrame(records)
            .set_index(index)
            .rename_axis(None)
            [values])
record_df = df["B"].apply(clean_cell, args=("color", "confidence"))
print(record_df)
   black  brown  gray  other   red  blond  white
0    1.0   0.72  0.62   0.52  0.01   0.01    0.0
1    0.8   0.50  0.40   0.32  0.11    NaN    NaN
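To end up with the color columns alongside the rest of the data, as the question asks, the expanded frame can be joined back onto df. This is a small sketch that repeats clean_cell on a shortened version of the sample data; dropping column B afterwards is an assumption about the desired result, not something the answer specifies.

```python
import pandas as pd

# Shortened version of the sample data, for illustration only
df = pd.DataFrame({
    "A": ["a", "b"],
    "B": [
        [{"color": "black", "confidence": 1.0},
         {"color": "white", "confidence": 0.0}],
        [{"color": "black", "confidence": 0.8}],
    ],
})

def clean_cell(records, index, values):
    # Same helper as in the answer: one Series per cell, indexed by color
    return (pd.DataFrame(records)
            .set_index(index)
            .rename_axis(None)
            [values])

record_df = df["B"].apply(clean_cell, args=("color", "confidence"))

# Replace the list column with the expanded color columns
result = df.drop(columns="B").join(record_df)
print(result)
```

The join aligns on the row index, so it relies on record_df keeping the same index as df, which apply preserves here.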