
Pandas Groupby and sorting

I am not finding a way to organize the following dataframe in Pandas to show all information I need. I have the following dataframe:


                 Fecha  ID  Nombre  Doc  Doc2   Prod  Cantidad
0  2021-03-06 00:00:00   1    Lolo  123     1   1564         1
1  2021-03-06 00:00:00   1    Lolo  123     1  15665         1
2  2021-03-06 00:00:00   1    Lolo  123     1   1674         1
3  2021-03-06 00:00:00   2    Momo  125     1   1568         1
4  2021-03-06 00:00:00   2    Momo  125     1   1568         1
5  2021-03-06 00:00:00   3    PePe  136     2   1568         1
6  2021-03-06 00:00:00   3    PePe  136     2   1678         1
7  2021-03-06 00:00:00   4    Lolo  123     1   1674         1
8  2021-03-06 00:00:00   5    Coco  125     2   1568         1
9  2021-03-07 00:00:00   6    Lolo  123     1  15665         1

Now I need to sum the Cantidad of each Prod for the same date for each person. For example, Lolo has rows 0, 1, 2 and 7 on the same day. The way I found to do it was with groupby after concatenating Doc + Doc2 + the date part of Fecha (all strings), which was the only way I found to group while keeping the same person on different dates separate. The code is the following:

df['Concat'] = df['Doc'] + df['Doc2'] + df['Fecha'].str[:-9].str.replace('-','')

gb = df.groupby(['Concat', 'Fecha', 'Nombre', 'Doc', 'Doc2', 'Prod'],
                as_index=False)[['Cantidad']].sum()

and I get this result:

         Concat                Fecha  Nombre  Doc  Doc2   Prod  Cantidad
0  123120210306  2021-03-06 00:00:00    Lolo  123     1   1564         1
1  123120210306  2021-03-06 00:00:00    Lolo  123     1   1674         2
2  123120210306  2021-03-06 00:00:00    Lolo  123     1  15665         1
3  123120210307  2021-03-07 00:00:00    Lolo  123     1  15665         1
4  125120210306  2021-03-06 00:00:00    Momo  125     1   1568         2
5  125220210306  2021-03-06 00:00:00    Coco  125     2   1568         1
6  136220210306  2021-03-06 00:00:00    PePe  136     2   1568         1
7  136220210306  2021-03-06 00:00:00    PePe  136     2   1678         1

The grouping is correct. The problem comes when I want to include the ID in the dataframe and select the minimum ID for the date, which in this case for Lolo is 1 (she has IDs 1 and 4 that day in the example).

Every time I add ID to the groupby keys, the quantities are no longer summed together.

Could someone guide me on how to get the solution? The result should be like this:

         Concat                Fecha  ID  Nombre  Doc  Doc2   Prod  Cantidad
0  123120210306  2021-03-06 00:00:00   1    Lolo  123     1   1564         1
1  123120210306  2021-03-06 00:00:00   1    Lolo  123     1   1674         2
2  123120210306  2021-03-06 00:00:00   1    Lolo  123     1  15665         1
3  123120210307  2021-03-07 00:00:00   6    Lolo  123     1  15665         1
4  125120210306  2021-03-06 00:00:00   2    Momo  125     1   1568         2
5  125220210306  2021-03-06 00:00:00   5    Coco  125     2   1568         1
6  136220210306  2021-03-06 00:00:00   3    PePe  136     2   1568         1
7  136220210306  2021-03-06 00:00:00   3    PePe  136     2   1678         1

Thanks.

It looks like you want the minimum ID for each date, and I think you would then want to use that minimum ID for every row with that date. If so, do a separate groupby to get just that data, then merge it back on date. See this toy example:

import pandas as pd

df = pd.DataFrame({'date': ["2021-03-06", "2021-03-06", "2021-03-07", "2021-03-07"], 'ID': [1, 2, 3, 4]})

df_min_id = df.groupby('date', as_index=False)['ID'].min()

# you may want to rename the ID column to flag that it is the min
df_min_id = df_min_id.rename(columns={'ID': 'min_ID'})

df = df.merge(df_min_id, on='date', how='left')
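# in your case, I think you want something like the line below (left commented
# out because gb is not defined in this toy example; use your own column names
# in place of 'date'):
# gb = gb.merge(df_min_id, on='date', how='left')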

print(df)
#          date  ID  min_ID
# 0  2021-03-06   1       1
# 1  2021-03-06   2       1
# 2  2021-03-07   3       3
# 3  2021-03-07   4       3
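
Applied to the data in the question, the same groupby-then-merge idea might look like the sketch below. The expected output in the question keeps a separate minimum ID per person and date (Momo keeps 2, Coco keeps 5), so this sketch groups on Fecha, Nombre, Doc and Doc2 rather than on the date alone, which also makes the Concat helper column unnecessary. The sample frame is reconstructed from the question, the names person_day, min_id and result are made up for illustration, and Prod is assumed to be an integer column.

```python
import pandas as pd

# Sample data reconstructed from the question. Doc and Doc2 are strings,
# as the question states; Prod is assumed to be an integer.
df = pd.DataFrame({
    'Fecha': ['2021-03-06 00:00:00'] * 9 + ['2021-03-07 00:00:00'],
    'ID': [1, 1, 1, 2, 2, 3, 3, 4, 5, 6],
    'Nombre': ['Lolo', 'Lolo', 'Lolo', 'Momo', 'Momo',
               'PePe', 'PePe', 'Lolo', 'Coco', 'Lolo'],
    'Doc': ['123', '123', '123', '125', '125',
            '136', '136', '123', '125', '123'],
    'Doc2': ['1', '1', '1', '1', '1', '2', '2', '1', '2', '1'],
    'Prod': [1564, 15665, 1674, 1568, 1568, 1568, 1678, 1674, 1568, 15665],
    'Cantidad': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
})

# Columns that together identify one person on one date (hypothetical name).
person_day = ['Fecha', 'Nombre', 'Doc', 'Doc2']

# Step 1: sum Cantidad per person, date and product. ID is deliberately
# left out of the keys so rows with different IDs are still combined.
gb = df.groupby(person_day + ['Prod'], as_index=False)['Cantidad'].sum()

# Step 2: compute the minimum ID per person and date in a separate groupby.
min_id = df.groupby(person_day, as_index=False)['ID'].min()

# Step 3: attach that minimum ID to every aggregated row of the same
# person and date.
result = gb.merge(min_id, on=person_day, how='left')

print(result)
```

The rows come out sorted by the grouping keys, so their order and the column order differ from the table in the question. An equivalent route is to compute the minimum first with groupby(person_day)['ID'].transform('min'), store it in a new column, and include that column in the grouping keys of the sum.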


 