Group data by column values

Question

Hi, I have data (in an Excel file and also as a text file) like this:

C1   C2    C3
 1    p     a
 1    q     b
 2    r     c
 2    s     d

And I want output like this:

C1   C2   C3
 1   p,q  a,b
 2   r,s  c,d

How can I group the data based on column values? I am open to anything: any library, any language, any tool, such as Python, bash, or even Excel.

I think this can be done with pandas in Python, but I haven't used it before.

Any leads appreciated.

Answer 1

First, read the file with pandas.read_excel; the result is a DataFrame:

import pandas as pd
df = pd.read_excel('file.xlsx')

Then group by C1 and aggregate the other columns with groupby, agg and ','.join:

df = df.groupby('C1').agg(','.join).reset_index()
print(df)
   C1   C2   C3
0   1  p,q  a,b
1   2  r,s  c,d
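The grouping step can be tried without any file. The sketch below rebuilds the question's sample data in memory and joins each group's values with commas. The helper name join_values is hypothetical and not part of pandas; it converts values to str first, on the assumption that some columns might not already hold strings (plain ','.join only works on string values).

```python
import pandas as pd

# Sample data copied from the question; built in memory so no file is needed.
df = pd.DataFrame({
    'C1': [1, 1, 2, 2],
    'C2': ['p', 'q', 'r', 's'],
    'C3': ['a', 'b', 'c', 'd'],
})


def join_values(series):
    """Hypothetical helper: join one group's values with commas.

    Values are cast to str first so numeric columns do not break str.join.
    """
    return ','.join(series.astype(str))


# One output row per distinct C1 value; reset_index turns C1 back into a column.
grouped = df.groupby('C1').agg(join_values).reset_index()
print(grouped)
```

If every column is known to contain strings, agg(','.join) as in the answer is enough and the helper is unnecessary.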

If df has more columns and you only need C2 and C3, select them before aggregating:

df = df.groupby('C1')[['C2', 'C3']].agg(','.join).reset_index()
print(df)
   C1   C2   C3
0   1  p,q  a,b
1   2  r,s  c,d

To save the result to an Excel file, use DataFrame.to_excel with index=False so the index is not written:

df.to_excel('file.xlsx', index=False)
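The question also mentions a text file. A minimal sketch for that case follows, assuming the columns are separated by whitespace. The sample text is held in a string and read through io.StringIO so the example runs on its own; with a real file you would pass its path to pd.read_csv instead. The tab-separated output format is likewise an assumption, not something the question specifies.

```python
import io

import pandas as pd

# Stand-in for the contents of a whitespace-separated text file (assumed layout).
text = """C1 C2 C3
1 p a
1 q b
2 r c
2 s d
"""

# sep=r'\s+' treats any run of whitespace as a single column separator.
df_txt = pd.read_csv(io.StringIO(text), sep=r'\s+')

# Same grouping as for the Excel data: join C2 and C3 per C1 value.
out = df_txt.groupby('C1')[['C2', 'C3']].agg(','.join).reset_index()

# Without a path, to_csv returns the text; tabs keep the commas unambiguous.
csv_text = out.to_csv(index=False, sep='\t')
print(csv_text)
```

To write a file instead of a string, pass a path as the first argument to to_csv.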

Answer 1
3 2017-02-16 06:49:15
