Here is the scenario:
I have a large ordered dataset with 314 columns and over 300,000 rows for an ML problem.
I want to group the dataset by column X (Supplier).
Desired output:
df_train.groupby('Supplier').agg({<some columns>: 'last', <some columns>: 'sum', <some columns>: 'mean'})
Since this is a 314-column dataset, I can't just write out a dict containing every column by hand.
PS: I ordered the columns in the sequence in which I want to apply the different aggregations.
You could use select_dtypes to get the columns that are numeric, and use these in a dictionary comprehension.
# select_dtypes('number') picks out the numeric columns
numeric_cols = df_train.select_dtypes('number').columns
# sum the numeric columns, keep the last value of everything else;
# skip the grouping key itself, since it becomes the index after groupby
agg_dict = {c: 'sum' if c in numeric_cols else 'last' for c in df_train.columns if c != 'Supplier'}
grouped = df_train.groupby('Supplier').agg(agg_dict)
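For illustration, here is a minimal, self-contained sketch of the same idea on a small made-up DataFrame (the column names and values below are hypothetical, not taken from your dataset):

import pandas as pd

# Hypothetical toy data standing in for df_train
df_train = pd.DataFrame({
    'Supplier': ['A', 'A', 'B', 'B', 'B'],
    'Quantity': [10, 5, 7, 1, 2],               # numeric -> 'sum'
    'Price': [1.5, 2.0, 3.0, 4.0, 5.0],         # numeric -> 'sum'
    'Country': ['BR', 'BR', 'US', 'US', 'DE'],  # non-numeric -> 'last'
})

numeric_cols = df_train.select_dtypes('number').columns
agg_dict = {c: 'sum' if c in numeric_cols else 'last'
            for c in df_train.columns if c != 'Supplier'}

grouped = df_train.groupby('Supplier').agg(agg_dict)
print(grouped)
#           Quantity  Price Country
# Supplier
# A               15    3.5      BR
# B               10   12.0      DE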
Regarding your one-hot encoded columns, you will need to provide more information about how they can be identified.