
Dataframe transformation in Pandas

I am trying to get frequency counts of certain columns in a dataframe in Python. I have a dataframe that looks like this:

Animal    Body Part 1  Body Part 2  Body Part 3
Monkey    Tail         Head         Legs
Elephant  Head         Tail         Trunk
Monkey    Ears         Head         Legs
Elephant  Eyes         Tail         Legs
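
For reference, the table above can be built as a DataFrame roughly like this. It is a minimal sketch: the variable name df is an assumption chosen to match the snippets in the answer, and the column names are copied from the table header.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Animal": ["Monkey", "Elephant", "Monkey", "Elephant"],
    "Body Part 1": ["Tail", "Head", "Ears", "Eyes"],
    "Body Part 2": ["Head", "Tail", "Head", "Tail"],
    "Body Part 3": ["Legs", "Trunk", "Legs", "Legs"],
})

print(df)
```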

The output I am looking for is the count of each body part for the corresponding animal (shown below). The distinct body-part values become the rows and the unique animals become the columns, with each cell giving the number of times that body part occurs for that animal. It is a form of pivot table, but I am not sure which method is the right one to apply here in Python.


      | Monkey| Elephant
-------------------------
Tail  | 1     | 2
Head  | 2     | 1
Legs  | 2     | 1
Ears  | 1     | 0
Trunk | 0     | 1      

One way is to melt the data, then use groupby().value_counts() and unstack the result:

(df.melt('Animal')
   .groupby('Animal')
   ['value'].value_counts()
   .unstack('Animal', fill_value=0)
)

Output:

Animal  Elephant   Monkey 
value                     
Ears            0        1
Eyes            1        0
Head            1        2
Legs            1        2
Tail            2        1
Trunk           1        0
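
To see why melting helps, it may be useful to look at the intermediate long-format frame. The sketch below rebuilds the sample data from the question (the name df is an assumption) and shows that melt('Animal') stacks the three body-part columns into a single value column, one row per (animal, body part) pair, which is what the grouping then counts.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Animal": ["Monkey", "Elephant", "Monkey", "Elephant"],
    "Body Part 1": ["Tail", "Head", "Ears", "Eyes"],
    "Body Part 2": ["Head", "Tail", "Head", "Tail"],
    "Body Part 3": ["Legs", "Trunk", "Legs", "Legs"],
})

# Keep 'Animal' as the identifier; every other column is unpivoted.
# The original column names go to 'variable', the cell contents to 'value'.
long_df = df.melt("Animal")

print(long_df.columns.tolist())
print(len(long_df))  # 4 rows x 3 body-part columns
```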

Option 2: similar to option 1, with set_index().stack() instead of melt:

(df.set_index('Animal')
   .stack().groupby(level=0)
   .value_counts()
   .unstack(level=0, fill_value=0)
)

Option 3: similar to option 1, but with pd.crosstab:

tmp = df.melt('Animal')
out = pd.crosstab(tmp['value'], tmp['Animal'])
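
As a rough end-to-end illustration of the crosstab approach, the sketch below runs it on the sample data from the question (rebuilt here under the assumed name df) and reorders the columns to match the layout the question asks for. The column reordering is an addition for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Animal": ["Monkey", "Elephant", "Monkey", "Elephant"],
    "Body Part 1": ["Tail", "Head", "Ears", "Eyes"],
    "Body Part 2": ["Head", "Tail", "Head", "Tail"],
    "Body Part 3": ["Legs", "Trunk", "Legs", "Legs"],
})

# Long format: one row per (animal, body part) pair.
tmp = df.melt("Animal")

# Count co-occurrences: body parts as rows, animals as columns.
out = pd.crosstab(tmp["value"], tmp["Animal"])

# Optional: put the columns in the order shown in the question.
out = out[["Monkey", "Elephant"]]

print(out)
```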

Option 4: apply Series.value_counts on the rows:

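(df.set_index('Animal')
   .apply(pd.Series.value_counts, axis=1)  # per-row counts; NaN where a part is absent
   .groupby(level=0).sum()                 # replaces sum(level=0), removed in pandas 2.0
   .T
)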
