Pandas get values of column based on groupby count

I have a dataframe named test, like so:

   ALT_K1 ALT_K2  ALT_K3  HS  VS
        1      A       1  45   2
        1      A       1  32  32
        1    1-1      70   1   1
        1    1-1      70   0   9
        1      A       2   3   0

I group by the first three columns and calculate the frequency of occurrence, like so:

test_frequency = test.groupby(['ALT_K1', 'ALT_K2', 'ALT_K3']).size().reset_index(name='count')

I want to be able to get the values of the columns HS and VS given the number of times the combination of the three columns appears. For example, for the combination (1, A, 1) I want to get the HS values [45, 32] and the VS values [2, 32].

I have been stuck on this for two days now and would appreciate any help.

Thanks

I think you need a custom lambda function with apply and unique:

import pandas as pd

# Wrap the method chain in parentheses so it can span several lines
test_frequency = (test.groupby(['ALT_K1', 'ALT_K2', 'ALT_K3'])
                      .apply(lambda x: pd.Series([x['HS'].unique(),
                                                  x['VS'].unique()], index=['HS','VS']))
                      .reset_index())
print(test_frequency)

   ALT_K1 ALT_K2  ALT_K3        HS       VS
0       1    1-1      70    [1, 0]   [1, 9]
1       1      A       1  [45, 32]  [2, 32]
2       1      A       2       [3]      [0]
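
Note that unique drops repeated values inside a group. If you would rather keep every value and also have the group size in the same result, one possible alternative (a sketch only, not part of the original answer) is named aggregation with list. The sample frame below is rebuilt from the question, and the names keys, result and group are just illustrative:

```python
import pandas as pd

# Sample data copied from the question.
test = pd.DataFrame({
    'ALT_K1': [1, 1, 1, 1, 1],
    'ALT_K2': ['A', 'A', '1-1', '1-1', 'A'],
    'ALT_K3': [1, 1, 70, 70, 2],
    'HS': [45, 32, 1, 0, 3],
    'VS': [2, 32, 1, 9, 0],
})

keys = ['ALT_K1', 'ALT_K2', 'ALT_K3']

# Named aggregation: one row per key combination, holding the group size
# and every HS/VS value (duplicates kept) collected into Python lists.
result = (test.groupby(keys)
              .agg(count=('HS', 'size'), HS=('HS', list), VS=('VS', list))
              .reset_index())
print(result)

# Look up the rows of a single combination, e.g. (1, 'A', 1).
group = test.groupby(keys).get_group((1, 'A', 1))
print(group['HS'].tolist(), group['VS'].tolist())
```

With the count column in place you can also filter by frequency, for example result[result['count'] == 2], to see only the combinations that appear twice.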
