Map values on DataFrame by two columns

Suppose I am trying to map total_cases_age onto cases_by_age, where the dataframes are:

results_grouped_age = results_grouped[['Make', 'age', 'Test Result', 'Number of Cases']].copy()
cases_by_age = results_grouped_age[['Make', 'age', 'Test Result', 'Number of Cases']].groupby(['Make', 'age', 'Test Result']).sum().reset_index()
total_cases_age = cases_by_age.groupby(['Make', 'age'])['Number of Cases'].sum()

However, whereas I would normally do:

cases_by_age['Total Cases'] = cases_by_age['age'].map(total_cases_age)

the index of total_cases_age is actually a combination of 'Make' and 'age', and that combination is the key I want to map on. To make my problem easier to understand, suppose I have the table cases_by_age:

        Make      age     Test Result     Number of Cases
0    ALFA ROMEO   0-3         ABA                1
1    ALFA ROMEO   0-3         ABR              NaN
2    ALFA ROMEO   0-3           F               45
3    ALFA ROMEO   0-3           P              268
4    ALFA ROMEO   0-3         PRS               21
5    ALFA ROMEO   3-5         ABA              NaN
6    ALFA ROMEO   3-5         ABR              NaN
7    ALFA ROMEO   3-5           F              159
8    ALFA ROMEO   3-5           P              720

And the end result should be something like this:

        Make      age     Test Result     Number of Cases     Total Cases by Age
0    ALFA ROMEO   0-3         ABA                1                 335
1    ALFA ROMEO   0-3         ABR              NaN                 335 
2    ALFA ROMEO   0-3           F               45                 335
3    ALFA ROMEO   0-3           P              268                 335
4    ALFA ROMEO   0-3         PRS               21                 335
5    ALFA ROMEO   3-5         ABA              NaN                 879
6    ALFA ROMEO   3-5         ABR              NaN                 879
7    ALFA ROMEO   3-5           F              159                 879
8    ALFA ROMEO   3-5           P              720                 879

And so on for the other makes and ages.

Any help will be greatly appreciated.

You could do a groupby-sum, followed by a left merge:

pd.merge(
    df,
    df['Number of Cases'].groupby(df['age']).sum().reset_index().rename(
        columns={'Number of Cases': 'Total Cases by Age'}),
    how='left')
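Note that the snippet above groups on age alone, which is enough when the frame holds a single make. Since the question asks for totals per make and age, one possible variant groups and merges on both columns. This is a minimal sketch, not the original answer's code: the 'BMW' row is hypothetical data added only to show that makes are kept apart.

```python
import pandas as pd

# Toy data; the 'BMW' row is hypothetical and only there to show two makes.
df = pd.DataFrame({
    'Make': ['ALFA ROMEO', 'ALFA ROMEO', 'ALFA ROMEO', 'BMW'],
    'age': ['0-3', '0-3', '3-5', '0-3'],
    'Number of Cases': [1, 10, 2, 7],
})

# One row per (Make, age) pair holding the summed case count.
totals = (
    df.groupby(['Make', 'age'], as_index=False)['Number of Cases']
      .sum()
      .rename(columns={'Number of Cases': 'Total Cases by Age'})
)

# A left merge keeps every row of df, in its original order.
result = df.merge(totals, on=['Make', 'age'], how='left')
print(result)
```

Passing on=['Make', 'age'] explicitly keeps the merge from silently picking up any other column name the two frames happen to share.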

Example

Suppose you start with:

import pandas as pd

df = pd.DataFrame({
    'Make': ['ALPHA ROMEO'] * 3,
    'age': ['0-3', '0-3', '3-5'],
    'Number of Cases': [1, 10, 2]
})
>>> df
          Make  Number of Cases  age
0  ALPHA ROMEO                1  0-3
1  ALPHA ROMEO               10  0-3
2  ALPHA ROMEO                2  3-5

Then the groupby-sum gives:

>>> df['Number of Cases'].groupby(df['age']).sum().reset_index().rename(
...     columns={'Number of Cases': 'Total Cases by Age'})
   age  Total Cases by Age
0  0-3                  11
1  3-5                   2

And the combination gives:

>>> pd.merge(
...     df,
...     df['Number of Cases'].groupby(df['age']).sum().reset_index().rename(
...         columns={'Number of Cases': 'Total Cases by Age'}),
...     how='left')
          Make  Number of Cases  age  Total Cases by Age
0  ALPHA ROMEO                1  0-3                  11
1  ALPHA ROMEO               10  0-3                  11
2  ALPHA ROMEO                2  3-5                   2
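As an alternative to merging, the per-group total can be broadcast back onto each row with groupby(...).transform('sum'), or the question's own total_cases_age Series can be joined on its two index levels. The sketch below uses the rows from the question's table and assumes the NaN entries should be skipped when summing, which is what pandas does by default. It is an illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rows copied from the question's cases_by_age table.
cases_by_age = pd.DataFrame({
    'Make': ['ALFA ROMEO'] * 9,
    'age': ['0-3'] * 5 + ['3-5'] * 4,
    'Test Result': ['ABA', 'ABR', 'F', 'P', 'PRS', 'ABA', 'ABR', 'F', 'P'],
    'Number of Cases': [1, np.nan, 45, 268, 21, np.nan, np.nan, 159, 720],
})

# Option 1: transform returns a Series aligned with the original rows,
# so the group total can be assigned directly as a new column.
cases_by_age['Total Cases by Age'] = (
    cases_by_age.groupby(['Make', 'age'])['Number of Cases'].transform('sum')
)

# Option 2: reuse the question's total_cases_age, whose index is the
# (Make, age) MultiIndex, and join it on those two columns.
total_cases_age = cases_by_age.groupby(['Make', 'age'])['Number of Cases'].sum()
joined = cases_by_age.drop(columns='Total Cases by Age').join(
    total_cases_age.rename('Total Cases by Age'), on=['Make', 'age']
)

print(cases_by_age)
```

The transform route needs no intermediate table, while the join route is closer to the map call attempted in the question because it looks rows up in total_cases_age by the (Make, age) pair.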
