ID | Create Date | Last Modify Date |
---|---|---|
1 | 03/31/2021 8:56 | 03/31/2021 09:46 |
1 | 03/31/2021 5:56 | 03/31/2021 09:48 |
2 | 03/31/2021 0:23 | 03/31/2021 09:47 |
2 | 03/31/2021 6:56 | 03/31/2021 09:46 |
3 | 03/31/2021 7:32 | 03/31/2021 09:46 |
3 | 03/31/2021 8:45 | 03/31/2021 09:46 |
Hello,
For the table above, I need to mark the oldest Create Date for each ID with the comment "Minimal".
import os
import pandas as pd
from tkinter import filedialog
inputFolder = os.getcwd()
filename = filedialog.askopenfilename(title="Select file:", filetypes=(("xlsx files", ".xlsx"), ("all files", "*.*")), initialdir = inputFolder)
df = pd.read_excel(filename, index_col=None, header=0)
df.loc[(df.groupby(['BB Global ID']).agg({'Create Date': min})), 'Comment'] = 'Minimal'
print(df)
I tried to do it with the pandas df.loc indexer, but I get the error below.
KeyError: "None of [Index([('C', 'r', 'e', 'a', 't', 'e', ' ', 'D', 'a', 't', 'e')], dtype='object')] are in the [index]"
Below is the final result I want to achieve:
ID | Create Date | Last Modify Date | Comment |
---|---|---|---|
1 | 03/31/2021 8:56 | 03/31/2021 09:46 | |
1 | 03/31/2021 5:56 | 03/31/2021 09:48 | Minimal |
2 | 03/31/2021 0:23 | 03/31/2021 09:47 | Minimal |
2 | 03/31/2021 6:56 | 03/31/2021 09:46 | |
3 | 03/31/2021 7:32 | 03/31/2021 09:46 | Minimal |
3 | 03/31/2021 8:45 | 03/31/2021 09:46 | |
Use GroupBy.transform to repeat each group's aggregated value on every row of that group, so the result can be compared with the original column:
mask = df.groupby(['BB Global ID'])['Create Date'].transform('min').eq(df['Create Date'])
df.loc[mask, 'Comment'] = 'Minimal'
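Or, with NumPy (needs import numpy as np), which also fills the non-matching rows with an empty string: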
df['Comment'] = np.where(mask, 'Minimal', '')
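For reference, here is a minimal self-contained sketch of the transform approach on the sample data from the question. It makes a few assumptions: the ID column is named BB Global ID as in the question's code (the table header shows it as ID), the frame is built inline instead of being read from Excel, and Create Date is parsed to datetime so that the comparison is chronological rather than textual.

```python
import numpy as np
import pandas as pd

# Sample rows from the question, built inline instead of read from Excel.
df = pd.DataFrame({
    "BB Global ID": [1, 1, 2, 2, 3, 3],
    "Create Date": [
        "03/31/2021 8:56", "03/31/2021 5:56", "03/31/2021 0:23",
        "03/31/2021 6:56", "03/31/2021 7:32", "03/31/2021 8:45",
    ],
})

# Assumption: the dates arrive as text, so parse them before comparing.
df["Create Date"] = pd.to_datetime(df["Create Date"], format="%m/%d/%Y %H:%M")

# transform("min") returns a Series aligned with df: each row holds
# the minimum Create Date of its own group.
group_min = df.groupby("BB Global ID")["Create Date"].transform("min")

# True where the row's own date equals its group's minimum.
mask = df["Create Date"].eq(group_min)

df["Comment"] = np.where(mask, "Minimal", "")
print(df)
```

The key difference from the original attempt is that transform keeps the original index, so the boolean mask lines up row by row with df, whereas agg returns one row per group and cannot be used directly as a row selector in df.loc. Note that if two rows in a group share the same minimum date, both are marked.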