Group by in Python Pandas (Multiple columns join with , )

Question

I have a table in CSV just like this:

And I need to group it just like this:

Within each CONCURSO, only CIDADE and UF change.

I'm trying this code, but it doesn't work.

Can you guys help me, please?

import...

    new_df = pd.read_csv(fr'C:\Users\anton\Desktop\Anon\data\swamp\{date}\nao_tratado.csv')
    new_df = new_df.groupby(by=['Concurso'], as_index=False).agg(','.join)
    new_df = pd.concat([new_df]).to_csv(fr'C:\Users\anton\Desktop\Anon\data\lake\{date}\tratado.csv', index=False)
    print('We are done.')

Answer 1

The agg() method of a pandas groupby can take a dictionary as its func parameter. The dict maps each column name to the aggregation function to apply to that column.

I guess you can then do the following:

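    # Columns whose values are joined into one comma-separated string per group
    columns_to_aggregate = ["Cidade", "UF"]
    columns_for_groupby = ["Concurso"]
    # Every other column; a list comprehension keeps the original column order
    columns = [c for c in new_df.columns if c not in columns_for_groupby]
    # Join the varying columns; take "min" for columns that are constant within a group
    aggregation_func = {c: (lambda x: ", ".join(map(str, x))) if c in columns_to_aggregate else "min" for c in columns}
    new_df = new_df.groupby(by=columns_for_groupby, as_index=False).agg(aggregation_func)
    # to_csv returns None, so do not assign its result back to new_df
    new_df.to_csv(fr'C:\Users\anton\Desktop\Anon\data\lake\{date}\tratado.csv', index=False)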
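
To see the idea in isolation, here is a small self-contained sketch of the same dict-based aggregation. The sample rows and the "Data" column are made up for illustration (the real CSV was not shown); only the Concurso, Cidade and UF names come from the question. The helper name join_values is also just an assumption of this example.

```python
import pandas as pd

# Hypothetical sample data: values and the "Data" column are invented,
# only the Concurso / Cidade / UF column names come from the question.
df = pd.DataFrame({
    "Concurso": [1, 1, 2],
    "Data": ["2021-03-01", "2021-03-01", "2021-03-08"],
    "Cidade": ["Recife", "Natal", "Manaus"],
    "UF": ["PE", "RN", "AM"],
})

group_cols = ["Concurso"]
join_cols = ["Cidade", "UF"]


def join_values(values):
    """Join all values of one group into a single string (str() guards against non-string cells)."""
    return ", ".join(map(str, values))


# One aggregation per non-grouping column: join the varying columns,
# take "min" for the columns that are constant within a group.
agg_map = {
    col: join_values if col in join_cols else "min"
    for col in df.columns
    if col not in group_cols
}

result = df.groupby(group_cols, as_index=False).agg(agg_map)
print(result)
```

Each Concurso ends up as a single row, with Cidade and UF holding the joined values. Mapping through str() matters because ", ".join raises a TypeError as soon as a column contains anything that is not a string, such as numbers or NaN.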

Tell me if it does not work :)
