Map values on DataFrame by two columns
Suppose I am trying to map total_cases_age onto cases_by_age, where the dataframes are:
results_grouped_age = results_grouped[['Make', 'age', 'Test Result', 'Number of Cases']].copy()
cases_by_age = results_grouped_age[['Make','age','Test Result','Number of Cases']].groupby(['Make','age','Test Result']).sum().reset_index()
total_cases_age = cases_by_age.groupby(['Make','age'])['Number of Cases'].sum()
However, whereas I would normally do:
cases_by_age['Total Cases'] = cases_by_age['age'].map(total_cases_age)
the index of total_cases_age is actually a combination of 'Make' and 'age', and mapping on that combination is what I want to do. To make my problem easier to understand, suppose I have the table cases_by_age:
         Make  age Test Result  Number of Cases
0  ALFA ROMEO  0-3         ABA                1
1  ALFA ROMEO  0-3         ABR              NaN
2  ALFA ROMEO  0-3           F               45
3  ALFA ROMEO  0-3           P              268
4  ALFA ROMEO  0-3         PRS               21
5  ALFA ROMEO  3-5         ABA              NaN
6  ALFA ROMEO  3-5         ABR              NaN
7  ALFA ROMEO  3-5           F              159
8  ALFA ROMEO  3-5           P              720
And the end result should be something like this:
         Make  age Test Result  Number of Cases  Total Cases by Age
0  ALFA ROMEO  0-3         ABA                1                 335
1  ALFA ROMEO  0-3         ABR              NaN                 335
2  ALFA ROMEO  0-3           F               45                 335
3  ALFA ROMEO  0-3           P              268                 335
4  ALFA ROMEO  0-3         PRS               21                 335
5  ALFA ROMEO  3-5         ABA              NaN                 879
6  ALFA ROMEO  3-5         ABR              NaN                 879
7  ALFA ROMEO  3-5           F              159                 879
8  ALFA ROMEO  3-5           P              720                 879
And so on for the other makes and ages.
Any help will be greatly appreciated.
You could do a groupby-sum, followed by a left merge:
pd.merge(
    df,
    df['Number of Cases'].groupby(df['age']).sum().reset_index().rename(
        columns={'Number of Cases': 'Total Cases by Age'}),
    how='left')
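Note that the snippet above groups by age only, while the question asks for totals per combination of Make and age. The same groupby-sum plus left merge can be keyed on both columns. The following is a minimal sketch, assuming a small made-up frame with two makes (the 'AUDI' rows and all the numbers are hypothetical, chosen only to show that both keys matter):

```python
import pandas as pd

# Hypothetical sample data with two makes, so that both keys matter.
df = pd.DataFrame({
    'Make': ['ALFA ROMEO', 'ALFA ROMEO', 'ALFA ROMEO', 'AUDI', 'AUDI'],
    'age': ['0-3', '0-3', '3-5', '0-3', '0-3'],
    'Number of Cases': [1, 10, 2, 5, 7],
})

# Sum the cases within each (Make, age) pair.
totals = (
    df.groupby(['Make', 'age'], as_index=False)['Number of Cases']
      .sum()
      .rename(columns={'Number of Cases': 'Total Cases by Age'})
)

# Left-merge the totals back onto every row, matching on both keys.
merged = pd.merge(df, totals, on=['Make', 'age'], how='left')
print(merged)
```

Passing on=['Make', 'age'] explicitly avoids relying on merge's default of joining on every shared column name.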
Example
Suppose you start with:
import pandas as pd

df = pd.DataFrame({
    'Make': ['ALPHA ROMEO'] * 3,
    'age': ['0-3', '0-3', '3-5'],
    'Number of Cases': [1, 10, 2]
})
>>> df
          Make  Number of Cases  age
0  ALPHA ROMEO                1  0-3
1  ALPHA ROMEO               10  0-3
2  ALPHA ROMEO                2  3-5
Then the groupby-sum gives:
>>> df['Number of Cases'].groupby(df['age']).sum().reset_index().rename(
...     columns={'Number of Cases': 'Total Cases by Age'})
   age  Total Cases by Age
0  0-3                  11
1  3-5                   2
And the combination gives:
>>> pd.merge(
...     df,
...     df['Number of Cases'].groupby(df['age']).sum().reset_index().rename(
...         columns={'Number of Cases': 'Total Cases by Age'}),
...     how='left')
          Make  Number of Cases  age  Total Cases by Age
0  ALPHA ROMEO                1  0-3                  11
1  ALPHA ROMEO               10  0-3                  11
2  ALPHA ROMEO                2  3-5                   2
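As an alternative to the merge, two other pandas idioms may fit the original question more directly. This is only a sketch under assumptions: the frame below is made-up data shaped like the question's cases_by_age (including a NaN), and the 'AUDI' rows are hypothetical. The first option uses groupby with transform('sum'), which returns one value per original row. The second reuses a Series indexed by (Make, age), like the question's total_cases_age, and attaches it with DataFrame.join on the two key columns.

```python
import numpy as np
import pandas as pd

# Hypothetical data shaped like the question's cases_by_age, including NaN.
cases_by_age = pd.DataFrame({
    'Make': ['ALFA ROMEO'] * 4 + ['AUDI'] * 2,
    'age': ['0-3', '0-3', '3-5', '3-5', '0-3', '0-3'],
    'Test Result': ['F', 'P', 'ABA', 'P', 'F', 'P'],
    'Number of Cases': [45, 268, np.nan, 720, 3, 30],
})

# Option 1: transform broadcasts each group's sum back to its rows.
# NaN values are skipped by sum, so they do not affect the totals.
cases_by_age['Total Cases by Age'] = (
    cases_by_age.groupby(['Make', 'age'])['Number of Cases'].transform('sum')
)

# Option 2: build a Series indexed by (Make, age), as in the question,
# and join it onto the frame using the two key columns.
total_cases_age = cases_by_age.groupby(['Make', 'age'])['Number of Cases'].sum()
joined = cases_by_age.drop(columns='Total Cases by Age').join(
    total_cases_age.rename('Total Cases'), on=['Make', 'age']
)

print(cases_by_age)
print(joined)
```

Both options keep the original row order and row count. The transform version needs no intermediate table, while the join version is closer to the map call attempted in the question.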