How to create a column from a pandas dataframe with the repeated values in dictionary format
I'm very confused about how to do this (I'm still a beginner). I need to convert this dataframe into a dictionary with a column that counts how often each value is repeated:
import pandas as pd
df = pd.DataFrame({'Name': [['John', 'hock'], ['John','pepe'],['Peter', 'wdw'],['Peter'],['John'], ['Stef'], ['John']],
'Age': [38, 47, 63, 28, 33, 45, 66]
})
and I need something like:
Name Age Repeated:
John 38 4
Thanks!
Use DataFrame.explode with GroupBy.size:
df = df.explode('Name').groupby(['Name']).size().reset_index(name='Repeated')
print (df)
Name Repeated
0 John 4
1 Peter 2
2 Stef 1
3 hock 1
4 pepe 1
5 wdw 1
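If the goal is closer to the layout in the question (Name, Age, Repeated) plus a plain dictionary, the same explode idea can be extended. The following is only a sketch under assumptions: the variable names `exploded`, `name_counts` and `counts` are made up for illustration, and it assumes each name should keep the Age of the row it came from.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': [['John', 'hock'], ['John', 'pepe'], ['Peter', 'wdw'],
             ['Peter'], ['John'], ['Stef'], ['John']],
    'Age': [38, 47, 63, 28, 33, 45, 66],
})

# One row per name; Age is repeated for every name in the original list.
exploded = df.explode('Name').reset_index(drop=True)

# Count how many times each name occurs across all rows.
name_counts = exploded['Name'].value_counts()

# Attach the total count of each name to every row that contains it.
exploded['Repeated'] = exploded['Name'].map(name_counts)

# Plain dict of name -> number of occurrences.
counts = name_counts.to_dict()

print(exploded.head())
print(counts)
```

Here `exploded` has one row per name, so the first row reads John, 38, 4, and `counts` holds the same totals as the Repeated column in dictionary form.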
I can think of something like:
resultDict = {}
for index, row in df.iterrows():
    for value in row["Name"]:
        if value not in resultDict:
            resultDict[value] = 0
        resultDict[value] += 1
resultDict
{'John': 4, 'Peter': 2, 'Stef': 1, 'hock': 1, 'pepe': 1, 'wdw': 1}
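The same counting loop can probably be written more compactly with the standard library. This is a sketch, not a replacement for the answer above: it assumes every entry in the Name column is a list of strings, and the names `name_counts`, `result` and `result_df` are chosen here for illustration.

```python
from collections import Counter
from itertools import chain

import pandas as pd

df = pd.DataFrame({
    'Name': [['John', 'hock'], ['John', 'pepe'], ['Peter', 'wdw'],
             ['Peter'], ['John'], ['Stef'], ['John']],
    'Age': [38, 47, 63, 28, 33, 45, 66],
})

# Flatten the lists in the Name column and count each name.
name_counts = Counter(chain.from_iterable(df['Name']))

# Counter is a dict subclass; convert it if a plain dict is needed.
result = dict(name_counts)

# Optional: the same counts as a two-column dataframe.
result_df = pd.DataFrame(list(name_counts.items()), columns=['Name', 'Repeated'])

print(result)
print(result_df)
```

`Counter` keeps the names in the order they are first seen, so the dataframe rows follow the order of appearance in the Name column.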
If you want to have it as a dataframe and not a dictionary:
resultDict = {}
for index, row in df.iterrows():
    for value in row["Name"]:
        if value not in resultDict:
            resultDict[value] = 0
        resultDict[value] += 1
# Convert the dict views to lists so pandas receives plain list-like columns.
pd.DataFrame({"Name": list(resultDict.keys()), "Repeated": list(resultDict.values())})
Name | Repeated
---|---
John | 4
hock | 1
pepe | 1
Peter | 2
wdw | 1
Stef | 1