Group a dataframe and select one cell among them based on a condition

Question

My dataset look like this Tr, Date, Time AV81312,20161014,121000 AV81312,20161014,160221 AV85012,20170422,150858 AV85012,20161108,11137 AV86157,20170426,45747 AV86157,20170426,45744 AV86157,20160813,134312

I need to select only one item from each Tr having latest record ie latest record having date and and time higher

Required output is Tr, Date, Time AV81312,20161014,160221 AV85012,20170422,150858 AV86157,20170426,45747

My code is

df2 = read_csv("sample.csv") df2 = df2.values x = [] for i in df2: for j in df2: if i[2] == j[2]: if i[3] >= j[3]: x.append(i) It wasn't working as expected

Answer 1

Use -

df['Date_Time'] = pd.to_datetime(df['Date'].astype(str).str.cat(df['Time'].astype(str)),format='%Y%m%d%H%M%S')
df.loc[df.groupby('Tr')['Date_Time'].idxmax()].drop('Date_Time', axis=1)

Output

        Tr      Date    Time
0  AV81312  20161014  160221
1  AV85012  20170422  150858
2  AV86157  20170426   45747

Group a dataframe and select one cell among them based on a condition

Question

1 answers

solution1
1 ACCPTED 2019-01-16 13:31:33

Group a dataframe and select one cell among them based on a condition

Question

1 answers

solution1 1 ACCPTED 2019-01-16 13:31:33

solution1
1 ACCPTED 2019-01-16 13:31:33