Group all rows of a pandas DataFrame (with many columns) that have the same value in a given column

Question

I have been searching for hours. I have a DataFrame like this:

     col1.  col2.   col3.   col4
row1.  a.    p       u       0
row2.  b.    q       v       1
row3.  a.    r       w       2
row4.  d.    s       x       3
row5.  b.    t       y       4

Now I want to group all these rows by the value of 'col1' so that I get:

     col1.  col2.   col3.   col4
row1.  a.    p r     u w    0,2
row2.  b.    q t     v y    1,4
row3.  d.    s       x       3

I found that df.groupby('col1')['col2'].apply(' '.join) groups all values in 'col2' by the value of 'col1'. However, I am unable to extend this command so that the values of all columns are grouped together to produce the output shown above.

The DataFrame above is just for illustration. The actual DataFrame has around 100 rows and columns, and every cell stores feedback except for col1, which stores the name of the item the feedback is about. I want to group all columns by item (col1), and then I will perform sentiment analysis on the DataFrame.

Answer 1

You can use:

df1 = df.astype(str).groupby('col1').agg(','.join).reset_index()
print (df1)
  col1 col2 col3 col4
0   a.  p,r  u,w  0,2
1   b.  q,t  v,y  1,4
2   d.    s    x    3
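The desired output in the question appears to join col2 and col3 with spaces and col4 with commas. A minimal sketch of that variant, assuming a dict of per-column join functions is acceptable (the sample values are copied from the printed output above, and the choice of separators is an assumption read off the question):

```python
import pandas as pd

# Sample data copied from the question and the answer's printed output.
df = pd.DataFrame(
    {
        "col1": ["a.", "b.", "a.", "d.", "b."],
        "col2": ["p", "q", "r", "s", "t"],
        "col3": ["u", "v", "w", "x", "y"],
        "col4": [0, 1, 2, 3, 4],
    },
    index=["row1.", "row2.", "row3.", "row4.", "row5."],
)

# One join function per column: spaces for the text columns, commas for col4.
joiners = {"col2": " ".join, "col3": " ".join, "col4": ",".join}

# astype(str) is still needed so that the numeric col4 can be joined.
df_mixed = df.astype(str).groupby("col1").agg(joiners).reset_index()
print(df_mixed)
```

With around 100 columns, the dict could be built in a loop over df.columns rather than written by hand.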

If you also need the index labels (taken from the first row of each group):

df1 = df.astype(str).groupby('col1').agg(','.join).reset_index()
df1.index = df.drop_duplicates('col1').index
print (df1)
      col1 col2 col3 col4
row1.   a.  p,r  u,w  0,2
row2.   b.  q,t  v,y  1,4
row4.   d.    s    x    3
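One caveat worth checking: groupby sorts the group keys by default, while drop_duplicates keeps rows in order of first appearance. In the sample data the two orders happen to coincide. A hedged sketch of one way to keep them aligned when they do not, assuming sort=False is acceptable (the three-row frame below is made up for illustration):

```python
import pandas as pd

# Hypothetical data where "b." appears before "a.", so sorted order
# differs from order of first appearance.
df = pd.DataFrame(
    {"col1": ["b.", "a.", "b."], "col2": ["q", "p", "t"]},
    index=["row1.", "row2.", "row3."],
)

# sort=False keeps groups in order of first appearance, which is the same
# order that drop_duplicates('col1') produces.
df1 = df.astype(str).groupby("col1", sort=False).agg(",".join).reset_index()
df1.index = df.drop_duplicates("col1").index
print(df1)
```

Here row1. stays attached to b. and row2. to a., which would be swapped with the default sort=True.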

Explanation:

First, cast all columns to strings with astype, so that join also works on the numeric column.
Then group by col1 with groupby and aggregate each remaining column with agg, using join.
If you also need the index labels of the first row for each value of col1, add drop_duplicates.
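The steps above can be sketched one at a time on the sample data. This is only an illustration; the intermediate variable names (as_text, grouped) are not from the answer:

```python
import pandas as pd

# Sample data copied from the question and the answer's printed output.
df = pd.DataFrame(
    {
        "col1": ["a.", "b.", "a.", "d.", "b."],
        "col2": ["p", "q", "r", "s", "t"],
        "col3": ["u", "v", "w", "x", "y"],
        "col4": [0, 1, 2, 3, 4],
    },
    index=["row1.", "row2.", "row3.", "row4.", "row5."],
)

# Step 1: cast every column to str; ','.join raises TypeError on integers.
as_text = df.astype(str)

# Step 2: group on col1 and join the values of every other column.
grouped = as_text.groupby("col1").agg(",".join)

# Step 3: turn the group key back into an ordinary column.
df1 = grouped.reset_index()
print(df1)
```

Skipping step 1 is the usual reason the single-column approach from the question fails once a numeric column is included.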

Answer 1
2, accepted, 2018-03-21 11:27:48
