Grouping pandas dataframe by two columns without summarizing it

Question

I have a pandas DataFrame covering the different states in America. I would like to group it by the two columns Year and State so that I can run statistical tests on things such as cause of death, newborns, etc., and also plot it. The only approach I can come up with is the pandas groupby function, where I have to specify a statistical summary at the end, such as:

import pandas as pd
df = pd.read_csv(path + 'csvfile.csv')
grouped_df = df.groupby(['Year', 'State']).mean()

However, I would like to just group by the year and state alone, but doing so with groupby I get this:

import pandas as pd
df = pd.read_csv(path + 'csvfile.csv')
grouped_df = df.groupby(['Year', 'State'])

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x0000025720134688>

How can I do this?

Answer 1

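First, groupby on its own is lazy: it behaves like an iterator over the groups and computes nothing until you say what should happen next, for example an aggregate function or a custom function. So what matters is what you want to do after grouping.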
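
To illustrate that behaviour, here is a minimal sketch on a small made-up frame. The Deaths column and all values are assumptions for illustration, not the asker's data. Iterating over the groupby object yields one (key, sub-DataFrame) pair per Year/State combination, and get_group pulls out a single group without summarizing it:

```python
import pandas as pd

# Hypothetical stand-in for the CSV in the question (values are made up)
df = pd.DataFrame({
    "Year": [2019, 2019, 2019, 2020, 2020],
    "State": ["Ohio", "Ohio", "Texas", "Ohio", "Texas"],
    "Deaths": [10, 12, 20, 11, 25],
})

grouped = df.groupby(["Year", "State"])

# Iterating yields (key, sub-DataFrame) pairs; the rows stay unaggregated
groups = {key: sub_df for key, sub_df in grouped}

# Pull out the raw rows of a single (Year, State) combination
ohio_2019 = grouped.get_group((2019, "Ohio"))

# Number of distinct (Year, State) combinations
n_groups = grouped.ngroups
```

Each sub-DataFrame can then be passed to a statistical test or a plotting call on its own.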

I am not sure what you mean by grouping by the year and state alone. If you need a MultiIndex built from the two columns, use:

grouped_df = df.set_index(['Year', 'State'])
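
As a rough sketch of how such a MultiIndex frame could be used afterwards (again with a made-up Deaths column and made-up values as assumptions), sorting the index and selecting with loc or xs returns the original rows for a given year, state, or pair without any summary step:

```python
import pandas as pd

# Hypothetical stand-in for the CSV in the question (values are made up)
df = pd.DataFrame({
    "Year": [2019, 2019, 2019, 2020, 2020],
    "State": ["Ohio", "Ohio", "Texas", "Ohio", "Texas"],
    "Deaths": [10, 12, 20, 11, 25],
})

# Move Year and State into a two-level index; sorting makes lookups efficient
indexed = df.set_index(["Year", "State"]).sort_index()

# All rows for one (Year, State) pair, still unaggregated
ohio_2019 = indexed.loc[(2019, "Ohio")]

# All rows for one state across every year
texas = indexed.xs("Texas", level="State")

# All rows for one year across every state
year_2020 = indexed.loc[2020]
```

Whether this or iterating over the groupby object is more convenient depends on what the later tests and plots expect as input.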

1 answer

solution1
1 ACCEPTED 2021-12-01 08:31:45
