Python pandas: iterate over the rows of two columns and return each repeated value once, with its corresponding values on a single row

Question

For instance, I have a .csv file with thousands of rows like the one below:

year,name
1992,Alex
1992,Anna
1993,Max
1993,Bob
1993,Tom

and so on...

I want my output to be:

   year           name
   1992     Alex, Anna
   1993  Max, Bob, Tom

This looks simple, but I can't work out how to combine the rows that share a year into a single row, with the names separated by a comma ','.

Answer 1

You can achieve this with groupby and an aggregation. Try the code below:

# Keep one "year" value per group and join that group's names with ", ".
df = df.groupby("year").agg({
    "year": "first",
    "name": ", ".join,
})

You can then save the DataFrame to a CSV file without writing the index:

df.to_csv("output.csv",index=False)
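For reference, here is a minimal end-to-end sketch of the same idea, assuming the sample rows from the question. The inlined CSV text and the variable names (`csv_text`, `result`, `out_csv`) are only for illustration; in practice you would pass your own file path to `pd.read_csv`. It selects the `name` column before aggregating and uses `reset_index()` to turn `year` back into a regular column, which is a slight variation on the code above.

```python
import io

import pandas as pd

# Sample rows from the question, inlined so the sketch runs without a file.
csv_text = """year,name
1992,Alex
1992,Anna
1993,Max
1993,Bob
1993,Tom
"""

df = pd.read_csv(io.StringIO(csv_text))

# Join the names of each year with ", ", then move "year" from the index
# back into an ordinary column.
result = df.groupby("year")["name"].agg(", ".join).reset_index()
print(result)

# With no path given, to_csv returns the CSV text instead of writing a file.
out_csv = result.to_csv(index=False)
print(out_csv)
```

Note that the joined names contain commas, so `to_csv` wraps those fields in double quotes in the output file. Most CSV readers, including `pd.read_csv`, handle that transparently.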

Answer 2

This may help:

# Collect the distinct names for each year, then join them with ", ".
df = df.groupby('year')['name'].unique().reset_index()
df['name'] = df['name'].apply(lambda x: ', '.join(x))

Output:

   year           name
0  1992     Alex, Anna
1  1993  Max, Bob, Tom
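The `unique()` call only matters when a name appears more than once within the same year. A small sketch of the difference, assuming a made-up frame with one repeated name (the data and the variable names `joined_all` and `joined_unique` are illustrative, not from the question):

```python
import pandas as pd

# Hypothetical data: "Alex" appears twice in 1992.
df = pd.DataFrame({
    "year": [1992, 1992, 1992, 1993],
    "name": ["Alex", "Anna", "Alex", "Max"],
})

# Plain join keeps every occurrence.
joined_all = df.groupby("year")["name"].agg(", ".join)

# unique() drops repeats and keeps the order of first appearance.
joined_unique = df.groupby("year")["name"].agg(lambda s: ", ".join(s.unique()))

print(joined_all)
print(joined_unique)
```

If repeated names should be kept, the plain join is enough; if each name should be listed once per year, the `unique()` variant is the one to use.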

Answer 3

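How about this one?

import pandas as pd

# Sample data; 'col' is an extra numeric column to show a second aggregation.
x = pd.DataFrame.from_dict({'year': ['1992', '1992', '1993', '1993', '1993'],
                            'name': ['ALEX', 'ANNA', 'MAX', 'BOB', 'TOM'],
                            'col': range(5)})
print(x)

# Collect the distinct names per year into a tuple and sum 'col'.
# A set is unordered, so the order of names inside each tuple may vary.
a = x.groupby('year').agg({'name': lambda names: tuple(set(names)), 'col': 'sum'})
print(a)

Result: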

                 name  col
year                      
1992     (ALEX, ANNA)    1
1993  (BOB, TOM, MAX)    9
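Because `set` does not preserve order, the names in each tuple can come out in a different order from the input, as with (BOB, TOM, MAX) above. If the original order matters, one possible variation (a sketch, not part of the answer above) is to de-duplicate with `dict.fromkeys`, which keeps the first occurrence of each name in order:

```python
import pandas as pd

x = pd.DataFrame({
    "year": ["1992", "1992", "1993", "1993", "1993"],
    "name": ["ALEX", "ANNA", "MAX", "BOB", "TOM"],
    "col": range(5),
})

# dict.fromkeys drops duplicates while keeping first-appearance order,
# so the resulting tuples are deterministic.
a = x.groupby("year").agg({
    "name": lambda names: tuple(dict.fromkeys(names)),
    "col": "sum",
})
print(a)
```

To get the comma-separated string the question asks for instead of a tuple, replace `tuple(...)` with `", ".join(...)`.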

3 answers

Answer 1
3 2020-02-28 23:18:53

Answer 2
2 2020-02-28 23:25:45

Answer 3
1 accepted 2020-03-12 17:27:56
