Converting Dataframe into dictionary of lists
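I am dealing with a situation where I need to convert a dataframe into a dictionary of lists. A sample dataframe is shown below:
I want to convert the above dataframe into a dictionary of lists, like this:
dict = {"abc": ["sentence 1", "sentence 2"], "def": ["sentence 3"], "ghi": ["sentence 4", "sentence 5"]}
Here is my solution: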
dict = {}
for idx, row in test_df.iterrows():
    if not row["label"] in dict:
        dict[row["label"]] = []
    else:
        continue
for key in dict:
    dict[key] = list()
    for idx, row in test_df.iterrows():
        if key == row["label"]:
            dict[key].append(row["sentence"])
        else:
            continue
print(dict)
My solution works, but it looks like a lot of code, and there should be a simpler way. Any suggestions?
import pandas as pd

data = pd.DataFrame([
    {"sentence": "sentence1", "label": "abc"},
    {"sentence": "sentence2", "label": "abc"},
    {"sentence": "sentence3", "label": "def"},
    {"sentence": "sentence4", "label": "ghi"},
    {"sentence": "sentence5", "label": "ghi"},
])
data
    sentence  label
0  sentence1    abc
1  sentence2    abc
2  sentence3    def
3  sentence4    ghi
4  sentence5    ghi
data.groupby("label")["sentence"].apply(list).to_dict()
{'abc': ['sentence1', 'sentence2'],
'def': ['sentence3'],
'ghi': ['sentence4', 'sentence5']}
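One detail worth knowing about the groupby approach: by default, groupby sorts the group keys, so the resulting dictionary is ordered by label rather than by first appearance. Here is a minimal sketch, assuming you want to keep first-appearance order, that passes sort=False. The sample frame (with labels deliberately out of alphabetical order) and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the labels are deliberately not in alphabetical order.
data = pd.DataFrame({
    "sentence": ["sentence1", "sentence2", "sentence3", "sentence4", "sentence5"],
    "label": ["ghi", "abc", "ghi", "def", "abc"],
})

# Default behaviour: group keys come out sorted.
sorted_result = data.groupby("label")["sentence"].apply(list).to_dict()

# sort=False keeps the groups in order of first appearance.
ordered_result = data.groupby("label", sort=False)["sentence"].apply(list).to_dict()

print(list(sorted_result))
print(list(ordered_result))
```

Within each group the sentences keep their original row order in both cases; only the order of the keys differs.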
You can use groupby, as shown below:
import pandas as pd

df = pd.DataFrame(
    {
        'sentence': ['sentence1', 'sentence2', 'sentence3', 'sentence4', 'sentence5'],
        'label': ['abc', 'abc', 'def', 'ghi', 'ghi']
    }
)
df = df.groupby('label')['sentence'].apply(list)
print({k: v for k, v in df.items()})
Output:
{'abc': ['sentence1', 'sentence2'], 'def': ['sentence3'], 'ghi': ['sentence4', 'sentence5']}
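If you would rather not go through groupby at all, the same dictionary can be built in a single pass over the two columns. The following is only a sketch using collections.defaultdict from the standard library; the name result is my own.

```python
from collections import defaultdict

import pandas as pd

df = pd.DataFrame(
    {
        'sentence': ['sentence1', 'sentence2', 'sentence3', 'sentence4', 'sentence5'],
        'label': ['abc', 'abc', 'def', 'ghi', 'ghi']
    }
)

# Walk the two columns together and append each sentence under its label.
result = defaultdict(list)
for label, sentence in zip(df['label'], df['sentence']):
    result[label].append(sentence)

# Convert back to a plain dict so that missing keys raise KeyError again.
result = dict(result)
print(result)
```

This replaces the two nested iterrows() loops from the question with one loop, and the keys appear in order of first appearance.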
import pandas as pd
df = pd.DataFrame({'sentence':['10','20','30','40','50'], 'label' : ['abc', 'abc', 'def', 'ghi', 'ghi']})
d = {key: list(df.where(df.label == key).sentence.dropna().values) for key in set(df.label)}
This uses a dict comprehension.
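A possible variation on the comprehension, offered as a sketch rather than a definitive version: iterating over set(df.label) gives the keys in arbitrary order, whereas unique() returns the labels in order of first appearance. A boolean mask with .loc also avoids the where(...).dropna() round trip.

```python
import pandas as pd

df = pd.DataFrame({'sentence': ['10', '20', '30', '40', '50'],
                   'label': ['abc', 'abc', 'def', 'ghi', 'ghi']})

# unique() preserves first-appearance order; the mask selects the rows for each key.
d = {
    key: df.loc[df['label'] == key, 'sentence'].tolist()
    for key in df['label'].unique()
}
print(d)
```

Like the original comprehension, this scans the frame once per distinct label, so groupby is probably the better fit when there are many labels.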