I can't find a way to organize the following dataframe in Pandas so that it shows all the information I need. I have the following dataframe:
Fecha ID Nombre Doc Doc2 Prod Cantidad
0 2021-03-06 00:00:00 1 Lolo 123 1 1564 1
1 2021-03-06 00:00:00 1 Lolo 123 1 15665 1
2 2021-03-06 00:00:00 1 Lolo 123 1 1674 1
3 2021-03-06 00:00:00 2 Momo 125 1 1568 1
4 2021-03-06 00:00:00 2 Momo 125 1 1568 1
5 2021-03-06 00:00:00 3 PePe 136 2 1568 1
6 2021-03-06 00:00:00 3 PePe 136 2 1678 1
7 2021-03-06 00:00:00 4 Lolo 123 1 1674 1
8 2021-03-06 00:00:00 5 Coco 125 2 1568 1
9 2021-03-07 00:00:00 6 Lolo 123 1 15665 1
Now, I need to add up the `Cantidad` of each `Prod` for the same date for each person. For example, Lolo has rows 0, 1, 2 and 7 (IDs 1 and 4) on the same day. The way I found to do it was with `groupby` after concatenating `Fecha` + `Doc` + `Doc2` (all strings), which was the only way I found to group while keeping the same person on different dates separate. The code is the following:
df['Concat'] = df['Doc'] + df['Doc2'] + df['Fecha'].str[:-9].str.replace('-','')
gb = df.groupby(['Concat', 'Fecha', 'Nombre', 'Doc', 'Doc2', 'Prod'],
as_index=False)[['Cantidad']].sum()
and I get this result:
Concat Fecha Nombre Doc Doc2 Prod Cantidad
0 123120210306 2021-03-06 00:00:00 Lolo 123 1 1564 1
1 123120210306 2021-03-06 00:00:00 Lolo 123 1 1674 2
2 123120210306 2021-03-06 00:00:00 Lolo 123 1 15665 1
3 123120210307 2021-03-07 00:00:00 Lolo 123 1 15665 1
4 125120210306 2021-03-06 00:00:00 Momo 125 1 1568 2
5 125220210306 2021-03-06 00:00:00 Coco 125 2 1568 1
6 136220210306 2021-03-06 00:00:00 PePe 136 2 1568 1
7 136220210306 2021-03-06 00:00:00 PePe 136 2 1678 1
The grouping is correct. The problem comes when I want to include the `ID` in the dataframe and select the minimum ID for the date, which in this case for "Lolo" is 1 (she has IDs 1 and 4 that day in the example). Every time I put `ID` in the `groupby`, it stops summing the quantities.
Could someone guide me on how to get the solution? The result should be like this:
Concat Fecha ID Nombre Doc Doc2 Prod Cantidad
0 123120210306 2021-03-06 00:00:00 1 Lolo 123 1 1564 1
1 123120210306 2021-03-06 00:00:00 1 Lolo 123 1 1674 2
2 123120210306 2021-03-06 00:00:00 1 Lolo 123 1 15665 1
3 123120210307 2021-03-07 00:00:00 6 Lolo 123 1 15665 1
4 125120210306 2021-03-06 00:00:00 2 Momo 125 1 1568 2
5 125220210306 2021-03-06 00:00:00 5 Coco 125 2 1568 1
6 136220210306 2021-03-06 00:00:00 3 PePe 136 2 1568 1
7 136220210306 2021-03-06 00:00:00 3 PePe 136 2 1678 1
Thanks.
It looks like you want the minimum ID for each date, and I think you would then want to use that minimum ID for every row with that date. If so, do a separate groupby to get just that data, then merge it back on the date. See this toy example:
import pandas as pd

df = pd.DataFrame({'date': ["2021-03-06", "2021-03-06", "2021-03-07", "2021-03-07"], 'ID': [1, 2, 3, 4]})
df_min_id = df.groupby('date', as_index=False)['ID'].min()
# you may want to rename the ID column to flag that it is the min
df_min_id = df_min_id.rename(columns={'ID': 'min_ID'})
df = df.merge(df_min_id, on='date', how='left')
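# in your case, I think you want something like this, with df_min_id built
# from your own dataframe and your date column ('Fecha') as the key:
# gb = gb.merge(df_min_id, on='Fecha', how='left')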
print(df)
# date ID min_ID
# 0 2021-03-06 1 1
# 1 2021-03-06 2 1
# 2 2021-03-07 3 3
# 3 2021-03-07 4 3
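Applied to the dataframe in the question, the same groupby-then-merge idea might look like the sketch below. The expected output in the question shows a different minimum ID per person on the same date, so this sketch assumes the minimum should be taken per `Fecha` + `Doc` + `Doc2` rather than per date alone. It also assumes `Doc` and `Doc2` are strings, as the question says. The names `person_day`, `min_id` and `result` are made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question (Doc and Doc2 kept as strings).
df = pd.DataFrame({
    "Fecha": ["2021-03-06 00:00:00"] * 9 + ["2021-03-07 00:00:00"],
    "ID": [1, 1, 1, 2, 2, 3, 3, 4, 5, 6],
    "Nombre": ["Lolo", "Lolo", "Lolo", "Momo", "Momo",
               "PePe", "PePe", "Lolo", "Coco", "Lolo"],
    "Doc": ["123", "123", "123", "125", "125",
            "136", "136", "123", "125", "123"],
    "Doc2": ["1", "1", "1", "1", "1", "2", "2", "1", "2", "1"],
    "Prod": [1564, 15665, 1674, 1568, 1568, 1568, 1678, 1674, 1568, 15665],
    "Cantidad": [1] * 10,
})

# Assumption: these three columns together identify one person on one date.
person_day = ["Fecha", "Doc", "Doc2"]

# Sum Cantidad per person, date and product; ID is deliberately left out of the keys.
gb = df.groupby(person_day + ["Nombre", "Prod"], as_index=False)["Cantidad"].sum()

# Minimum ID per person and date, computed in a separate groupby.
min_id = df.groupby(person_day, as_index=False)["ID"].min()

# Attach that minimum ID to every summed row of the same person and date.
result = gb.merge(min_id, on=person_day, how="left")
print(result)
```

Grouping on the columns directly means the helper `Concat` column is not strictly needed for this step, although it can still be added afterwards if you want it in the output. Because `ID` is never a grouping key, Lolo's rows with IDs 1 and 4 on 2021-03-06 are still summed together.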