I have a dataframe like this:
account date
A 0812
A 0812
A 0812
A 0823
A 0823
B 0723
B 0730
B 0730
B 0801
B 0801
B 0801
I want to get the 'date' value for the first time the value changes per account. So the output I'm looking for is this:
account date
A 0823
B 0730
I have tried to do a dense rank groupby function and filter by rank equaling 1.
df.groupby('account')['date'].rank(method='dense') but the output keeps the same rank for the same value, which does not work. 'first' and 'last' ranks don't seem to be working either.
I believe you need DataFrame.drop_duplicates
first and then get second value per group, by GroupBy.cumcount
:
df1 = df.drop_duplicates(['account','date'])
df1 = df1[df1.groupby('account').cumcount().eq(1)]
print (df1)
account date
3 A 823
6 B 730
Or by GroupBy.nth
:
df1 = df.drop_duplicates(['account','date'])
df1 = df1.groupby('account', as_index=False).nth(1)
print (df1)
account date
3 A 823
6 B 730
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.