
How do I retain the column used in my group by with pandas?

I have two data frames. I would like to group the second data frame and then merge the two on the Company Name column. The issue is that after my group by statement I lose the Company Name column.

import pandas as pd

df1 = pd.DataFrame(
    {
        'Company Name': ['Google','Google','Microsoft','Microsoft','Amazon','Amazon'],
        'Location': ['Somewhere','Somewhere','Somewhere','Somewhere','Somewhere','Somewhere'],
    }
)

df = pd.DataFrame(
    {
        'Company Name': ['Google','Google','Microsoft','Microsoft','Amazon','Amazon'],
        'Sales': [12345,12345,12345,12345,12345,12345],
        'Company Type': ['Software','Software','Software','Software','Software','Software']
    }
)
df = df.groupby(['Company Name']).sum()

pd.merge(df1,df,how="inner",on="Company Name")

I get an error message when merging because df no longer has a Company Name column to join on.

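By default, groupby turns the grouping column into the index of the result, so Company Name is no longer a regular column. Replace this line: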

df = df.groupby(['Company Name']).sum()

With:

df = df.groupby('Company Name', as_index=False).sum()
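
To see the difference between the two calls, here is a minimal sketch that uses a reduced version of the question's data frame (only the Company Name and Sales columns; the variable names by_index and as_column are made up for illustration). It compares where the grouping key ends up in each case:

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Company Name': ['Google', 'Google', 'Microsoft', 'Microsoft', 'Amazon', 'Amazon'],
        'Sales': [12345, 12345, 12345, 12345, 12345, 12345],
    }
)

# Default behaviour: the grouping key becomes the index of the result.
by_index = df.groupby('Company Name').sum()
print(by_index.index.name)      # Company Name
print(list(by_index.columns))   # ['Sales']

# as_index=False: the grouping key stays a regular column.
as_column = df.groupby('Company Name', as_index=False).sum()
print(list(as_column.columns))  # ['Company Name', 'Sales']
```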

Then the merge will work as expected. With a pandas version that drops the non-numeric Company Type column during sum(), it returns:

  Company Name   Location  Sales
0       Google  Somewhere  24690
1       Google  Somewhere  24690
2    Microsoft  Somewhere  24690
3    Microsoft  Somewhere  24690
4       Amazon  Somewhere  24690
5       Amazon  Somewhere  24690
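
An alternative sketch, not part of the original answer: select the Sales column explicitly before summing, then call reset_index() to turn the index back into a column. Selecting the column keeps the result independent of how a given pandas version treats the string column Company Type during sum(). The names sales and merged are chosen here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        'Company Name': ['Google', 'Google', 'Microsoft', 'Microsoft', 'Amazon', 'Amazon'],
        'Location': ['Somewhere', 'Somewhere', 'Somewhere', 'Somewhere', 'Somewhere', 'Somewhere'],
    }
)

df = pd.DataFrame(
    {
        'Company Name': ['Google', 'Google', 'Microsoft', 'Microsoft', 'Amazon', 'Amazon'],
        'Sales': [12345, 12345, 12345, 12345, 12345, 12345],
        'Company Type': ['Software', 'Software', 'Software', 'Software', 'Software', 'Software'],
    }
)

# Aggregate only Sales, then move 'Company Name' from the index back to a column.
sales = df.groupby('Company Name')['Sales'].sum().reset_index()

# 'Company Name' is now a regular column in both frames, so the merge can use it.
merged = pd.merge(df1, sales, how='inner', on='Company Name')
print(merged)
```

Because df1 contains each company twice, the merged result still has six rows. If one row per company is wanted, df1.drop_duplicates() can be applied before merging.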
