Pandas get values of column based on groupby count
I have a dataframe named test like so:
ALT_K1 ALT_K2 ALT_K3 HS VS
1 A 1 45 2
1 A 1 32 32
1 1-1 70 1 1
1 1-1 70 0 9
1 A 2 3 0
I group by the first three columns and calculate the frequency of occurrence like so:
test_frequency = test.groupby(['ALT_K1', 'ALT_K2', 'ALT_K3']).size().reset_index(name='count')
I want to be able to get the values of the columns HS and VS for each combination of the three columns, alongside the number of times that combination appears. For example, for the combination (1, A, 1) I want to get the HS values [45, 32] and the VS values [2, 32].
I have been stuck on this for two days now and would appreciate any help.
Thanks
I think you need a custom lambda function with apply and unique:
import pandas as pd

# For each key combination, collect the distinct HS and VS values
test_frequency = (test.groupby(['ALT_K1', 'ALT_K2', 'ALT_K3'])
                      .apply(lambda x: pd.Series([x['HS'].unique(),
                                                  x['VS'].unique()], index=['HS', 'VS']))
                      .reset_index())
print (test_frequency)
ALT_K1 ALT_K2 ALT_K3 HS VS
0 1 1-1 70 [1, 0] [1, 9]
1 1 A 1 [45, 32] [2, 32]
2 1 A 2 [3] [0]
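Note that unique drops repeated values inside a group, so a combination whose HS is 45 twice would show [45] only. If you would rather keep every value and also carry the count along, here is a minimal sketch using named aggregation. It assumes pandas 0.25 or later; the sample frame is rebuilt from the question, and the names keys, result and row are only for illustration.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
test = pd.DataFrame({
    "ALT_K1": [1, 1, 1, 1, 1],
    "ALT_K2": ["A", "A", "1-1", "1-1", "A"],
    "ALT_K3": [1, 1, 70, 70, 2],
    "HS": [45, 32, 1, 0, 3],
    "VS": [2, 32, 1, 9, 0],
})

keys = ["ALT_K1", "ALT_K2", "ALT_K3"]

# Named aggregation: keep every HS/VS value per group as a list
# and count how many rows each key combination has.
result = (
    test.groupby(keys)
        .agg(HS=("HS", list), VS=("VS", list), count=("HS", "size"))
        .reset_index()
)
print(result)

# Look up a single combination, here (1, "A", 1).
mask = (
    (result["ALT_K1"] == 1)
    & (result["ALT_K2"] == "A")
    & (result["ALT_K3"] == 1)
)
row = result[mask].iloc[0]
print(row["HS"], row["VS"], row["count"])
```

If you only need the rows for one combination and not a summary table, test.groupby(keys).get_group((1, "A", 1)) returns the matching sub-frame directly.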