
Python pandas: iterating rows of two columns, returning each repeated value once with its corresponding values in a single row

For instance, I have a .csv file with thousands of rows like the ones below:

year,name
1992,Alex
1992,Anna
1993,Max
1993,Bob
1993,Tom

and so on...

I want my output to be:

   year           name
   1992     Alex, Anna
   1993  Max, Bob, Tom

This looks simple, but I can't work out how to combine the names that share a year into a single row, separated by a comma ','.

You can achieve this with groupby and aggregation. Try the code below:

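df = df.groupby("year").agg({
    "year": "first",    # keep year as a regular column too, so it survives index=False below
    "name": ", ".join,  # join all names in the group with a comma and a space
})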

You can then save the dataframe to a csv file, leaving out the index:

df.to_csv("output.csv",index=False)
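
For reference, the same idea can be sketched end to end. This is only a sketch under a few assumptions: the sample rows are the ones from the question, loaded from an in-memory string instead of a real file, and `as_index=False` is used as an alternative to aggregating `year` with `"first"`.

```python
import io
import pandas as pd

# Sample rows from the question; with a real file this would be
# pd.read_csv("input.csv"), where "input.csv" is a hypothetical file name.
csv_text = """year,name
1992,Alex
1992,Anna
1993,Max
1993,Bob
1993,Tom
"""
df = pd.read_csv(io.StringIO(csv_text))

# as_index=False keeps "year" as a regular column instead of the index.
# sort=False keeps the years in their order of first appearance.
result = df.groupby("year", as_index=False, sort=False).agg({"name": ", ".join})
print(result)

# Write to an in-memory buffer here; pass a path such as "output.csv" to write a file.
buffer = io.StringIO()
result.to_csv(buffer, index=False)
print(buffer.getvalue())
```

Because the joined names contain commas, `to_csv` quotes those fields in the output, so the file still reads back as two columns.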

This may help you:

df = df.groupby('year')['name'].unique().reset_index()
df['name'] = df['name'].apply(lambda x: ', '.join(x))

Output:

   year           name
0  1992     Alex, Anna
1  1993  Max, Bob, Tom
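
Note that `unique()` drops names that repeat within the same year, whereas a plain join keeps every row. The small comparison below is a sketch using made-up data with one duplicated name (the duplicate is an assumption for illustration, not part of the question's data):

```python
import pandas as pd

# Hypothetical data: "Alex" appears twice in 1992.
df = pd.DataFrame({
    "year": [1992, 1992, 1992, 1993],
    "name": ["Alex", "Anna", "Alex", "Max"],
})

# Plain join: every row contributes, duplicates included.
joined_all = df.groupby("year")["name"].agg(", ".join)

# unique() first: each name appears once per year, in order of first appearance.
joined_unique = df.groupby("year")["name"].unique().apply(", ".join)

print(joined_all)
print(joined_unique)
```

Which of the two is appropriate depends on whether repeated names in the input are meaningful or just noise.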

How about this one?

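import pandas as pd

x = pd.DataFrame.from_dict({'year': ['1992', '1992', '1993', '1993', '1993'],
                            'name': ['ALEX', 'ANNA', 'MAX', 'BOB', 'TOM'],
                            'col': range(5)})
print(x)

# Collect the unique names per year as a tuple and sum 'col' within each group.
# set() does not guarantee any particular order of the names.
a = x.groupby('year').agg({'name': lambda names: tuple(set(names)), 'col': 'sum'})
print(a)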

Result:

                 name  col
year                      
1992     (ALEX, ANNA)    1
1993  (BOB, TOM, MAX)    9
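
As the result shows, going through `set()` can reorder the names (BOB, TOM, MAX rather than MAX, BOB, TOM). If the original order matters, one possible variant is to de-duplicate with `dict.fromkeys`, which keeps first-appearance order, and join straight into a string. This is a sketch, not part of the answer above:

```python
import pandas as pd

x = pd.DataFrame({'year': ['1992', '1992', '1993', '1993', '1993'],
                  'name': ['ALEX', 'ANNA', 'MAX', 'BOB', 'TOM'],
                  'col': range(5)})

# dict.fromkeys() removes duplicates while preserving insertion order,
# so the names come out in the order they appear in the data.
a = x.groupby('year').agg({
    'name': lambda names: ', '.join(dict.fromkeys(names)),
    'col': 'sum',
})
print(a)
```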
