pandas - construct column depending on values in 2 separate columns of dataframe
I have a pandas dataframe that looks similar to this (I have made up an example, since I can't share the real data):
import pandas as pd

raw_data = {'regiment': ['Nighthawks', 'Nighthawks', 'Dragoons', 'Dragoons', 'Scouts'],
            'company': ['1st', '2nd', '1st', '2nd', '2nd'],
            'thisValue': [1, 2, 3, 2, 7],
            'total': [3, 3, 5, 5, 7]}
df = pd.DataFrame(raw_data, columns=['regiment', 'company', 'thisValue', 'total'])
df
The output is:
regiment company thisValue total
0 Nighthawks 1st 1 3
1 Nighthawks 2nd 2 3
2 Dragoons 1st 3 5
3 Dragoons 2nd 2 5
4 Scouts 2nd 7 7
I want the thisValue figures for each company laid out per regiment, with one row per regiment. That is, I need the resulting dataframe to look like this:
regiment 1stCompanyValue 2nd_Company_Value total
Nighthawks 1 2 3
Dragoons 3 2 5
Scouts 0 7 7
I tried grouping on the company values, but I am not sure how to proceed from there. How can this be done in pandas?
We can make use of pivot, groupby and concat, i.e.:
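# one column per company, holding thisValue; missing combinations become 0
one = df.pivot(columns='company', values='thisValue', index='regiment').add_suffix('_company_value').fillna(0)
# total is repeated within a regiment, so take the first one
two = df.groupby('regiment')['total'].first()
# axis must be passed by keyword in recent pandas versions
ndf = pd.concat([one, two], axis=1)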
1st_company_value 2nd_company_value total
regiment
Dragoons 3.0 2.0 5
Nighthawks 1.0 2.0 3
Scouts 0.0 7.0 7
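A possible variant of the same idea, offered as a sketch rather than the one correct way: pivot_table accepts a fill_value, and unlike pivot it aggregates duplicate (regiment, company) pairs instead of raising. The choice of aggfunc='sum' and the names wide, totals and result are assumptions made for this illustration; the example data is the one from the question.

```python
import pandas as pd

raw_data = {'regiment': ['Nighthawks', 'Nighthawks', 'Dragoons', 'Dragoons', 'Scouts'],
            'company': ['1st', '2nd', '1st', '2nd', '2nd'],
            'thisValue': [1, 2, 3, 2, 7],
            'total': [3, 3, 5, 5, 7]}
df = pd.DataFrame(raw_data)

# One column per company; fill_value=0 covers regiments that lack a company.
# aggfunc='sum' is an assumption: it only matters if a pair occurs twice.
wide = df.pivot_table(index='regiment', columns='company', values='thisValue',
                      aggfunc='sum', fill_value=0).add_suffix('_company_value')

# total is constant within a regiment in this data, so the first value is enough.
totals = df.groupby('regiment')['total'].first()

# Align both pieces on the regiment index, then turn the index into a column.
result = wide.join(totals).reset_index()
print(result)
```

The rows come out sorted by regiment name, as in the answer above. If the original row order matters, reindexing on df['regiment'].unique() before reset_index should restore it.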