How do I retain the column name used in my group by with Pandas?

Question

I have two data frames. I would like to use group by on the second data frame and then merge the two together on the Company Name column. The issue is that after my group by statement I lose the Company Name column.

import pandas as pd

df1 = pd.DataFrame(
    {
        'Company Name': ['Google','Google','Microsoft','Microsoft','Amazon','Amazon'],
        'Location': ['Somewhere','Somewhere','Somewhere','Somewhere','Somewhere','Somewhere'],
    }
)

df = pd.DataFrame(
    {
        'Company Name': ['Google','Google','Microsoft','Microsoft','Amazon','Amazon'],
        'Sales': [12345,12345,12345,12345,12345,12345],
        'Company Type': ['Software','Software','Software','Software','Software','Software']
    }
)
df = df.groupby(['Company Name']).sum()

pd.merge(df1,df,how="inner",on="Company Name")

I get an error message when merging due to df not having a Company Name column to perform the join.

Answer 1

Replace this line:

df = df.groupby(['Company Name']).sum()

With:

df = df.groupby('Company Name', as_index=False).sum()

Then your code will work as expected, and return:

  Company Name   Location  Sales
0       Google  Somewhere  24690
1       Google  Somewhere  24690
2    Microsoft  Somewhere  24690
3    Microsoft  Somewhere  24690
4       Amazon  Somewhere  24690
5       Amazon  Somewhere  24690
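
For context, here is a minimal sketch of why this works, assuming a recent pandas version. By default, groupby moves the grouping key into the index of the result; as_index=False (or calling reset_index() afterwards) keeps it as an ordinary column that merge can join on. The names totals, alt and merged are illustrative only and do not come from the original post. Sales is selected explicitly because recent pandas versions may try to aggregate the text column Company Type as well, rather than silently dropping it.

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        'Company Name': ['Google', 'Google', 'Microsoft', 'Microsoft', 'Amazon', 'Amazon'],
        'Location': ['Somewhere'] * 6,
    }
)

df = pd.DataFrame(
    {
        'Company Name': ['Google', 'Google', 'Microsoft', 'Microsoft', 'Amazon', 'Amazon'],
        'Sales': [12345] * 6,
        'Company Type': ['Software'] * 6,
    }
)

# Default behaviour: the grouping key becomes the index, not a column.
by_index = df.groupby('Company Name')['Sales'].sum()
print(by_index.index.name)

# as_index=False keeps 'Company Name' as a regular column.
totals = df.groupby('Company Name', as_index=False)['Sales'].sum()
print(list(totals.columns))

# Alternative: group normally, then move the index back into a column.
alt = df.groupby('Company Name')['Sales'].sum().reset_index()

# The merge can now find 'Company Name' in both frames.
merged = pd.merge(df1, totals, how='inner', on='Company Name')
print(merged)
```

Either form gives the same frame; as_index=False simply avoids the extra reset_index() call.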

solution1
1 ACCEPTED 2019-05-31 02:15:01
