
Grouping all rows of a pandas DataFrame (with many columns) with the same value in a given column

I have been searching for hours. I have a DataFrame like this:

     col1.  col2.   col3.   col4
row1.  a.    p       u       0
row2.  b.    q       v       1
row3.  a.    r       w       2
row4.  d.    s       x       3
row5.  b.    t       y       4

Now I want to group all these rows by the value of 'col1' so that I get:

     col1.  col2.   col3.   col4
row1.  a.    p r     u w    0,2
row2.  b.    q t     v y    1,4
row3.  d.    s       x       3

I found that df.groupby('col1')['col2'].apply(' '.join) joins all the values in 'col2' that share the same value of 'col1'. But I am unable to extend that command so that the rows of all columns are grouped together to give the output shown earlier.


The above DataFrame is just for illustration. The actual DataFrame has around 100 rows and columns, and every cell stores feedback except for col1, which stores the name of the item the feedback is about. I want to group all columns by item (col1), and then I will perform sentiment analysis on the DataFrame.

You can use:

df1 = df.astype(str).groupby('col1').agg(','.join).reset_index()
print (df1)
  col1 col2 col3 col4
0   a.  p,r  u,w  0,2
1   b.  q,t  v,y  1,4
2   d.    s    x    3
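
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the sample frame looks exactly as printed in the question, with the trailing dots in 'col1' and in the row labels treated as part of the data.

```python
import pandas as pd

# Rebuild the sample frame from the question (labels copied as shown, dots included).
df = pd.DataFrame(
    {
        "col1": ["a.", "b.", "a.", "d.", "b."],
        "col2": ["p", "q", "r", "s", "t"],
        "col3": ["u", "v", "w", "x", "y"],
        "col4": [0, 1, 2, 3, 4],
    },
    index=["row1.", "row2.", "row3.", "row4.", "row5."],
)

# Cast everything to str so that ','.join also works on the numeric col4,
# then join the values of every remaining column within each col1 group.
df1 = df.astype(str).groupby("col1").agg(",".join).reset_index()
print(df1)
```

Without the astype(str) step, the join would be expected to fail on col4, because str.join only accepts strings.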

If you also need the original index labels (the label of the first row in each group):

df1 = df.astype(str).groupby('col1').agg(','.join).reset_index()
df1.index = df.drop_duplicates('col1').index
print (df1)
      col1 col2 col3 col4
row1.   a.  p,r  u,w  0,2
row2.   b.  q,t  v,y  1,4
row4.   d.    s    x    3
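
One caveat: groupby sorts the group keys by default, while drop_duplicates keeps rows in order of first appearance. The two line up in the sample only because a, b and d first appear in sorted order. A sketch of a variant that should stay aligned on other data, using made-up values where 'b' appears before 'a', passes sort=False:

```python
import pandas as pd

# Hypothetical data in which the first group to appear ("b") is not first alphabetically.
df = pd.DataFrame(
    {"col1": ["b", "a", "b"], "col2": ["q", "p", "t"], "col4": [1, 0, 4]},
    index=["row1", "row2", "row3"],
)

# sort=False keeps groups in order of first appearance, matching drop_duplicates.
out = df.astype(str).groupby("col1", sort=False).agg(",".join).reset_index()
out.index = df.drop_duplicates("col1").index
print(out)
```

With the default sort=True, the group 'a' would come first and would be labelled with row1, which actually belongs to 'b'.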

Explanation:

  1. First cast all columns to strings with astype.
  2. Then group by col1 with groupby and join the values of each group with agg.
  3. If you also need the index labels of the first row for each col1 value, add drop_duplicates.
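
The same cast-then-join steps can also use a different separator per column. The desired output in the question joins col2 and col3 with spaces and col4 with commas; a sketch of that, passing a dict of per-column functions to agg (the name joiners is only illustrative), might look like this:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "col1": ["a.", "b.", "a.", "d.", "b."],
        "col2": ["p", "q", "r", "s", "t"],
        "col3": ["u", "v", "w", "x", "y"],
        "col4": [0, 1, 2, 3, 4],
    }
)

# Map each column to its own join function: spaces for the text columns,
# commas for the numeric column (already cast to str by astype).
joiners = {"col2": " ".join, "col3": " ".join, "col4": ",".join}
result = df.astype(str).groupby("col1").agg(joiners).reset_index()
print(result)
```

For a frame with around 100 columns, the dict could be built with a comprehension over df.columns rather than written by hand.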
