
Pandas: Finding the first time a value changes in a column?

I have a dataframe like this:

account    date
  A        0812
  A        0812
  A        0812
  A        0823
  A        0823
  B        0723
  B        0730
  B        0730
  B        0801
  B        0801
  B        0801

For each account, I want to get the 'date' value at the first point where the value changes. So the output I'm looking for is this:

account   date
  A       0823
  B       0730

I have tried a dense rank inside a groupby, planning to filter on the rank equaling 1.

I used df.groupby('account')['date'].rank(method='dense'), but the output assigns the same rank to every row with the same value, so filtering on the rank does not isolate a single row. The 'first' and 'last' ranks don't seem to work either.
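To make the problem concrete, here is a minimal sketch of what the dense rank returns on the sample data. It assumes the 'date' column holds integers (812, 823, ...), which is a guess based on the output printed in the answer; the variable name `ranks` is only for illustration.

```python
import pandas as pd

# Sample data from the question; dates are assumed to be stored as integers.
df = pd.DataFrame({
    "account": ["A"] * 5 + ["B"] * 6,
    "date": [812, 812, 812, 823, 823,
             723, 730, 730, 801, 801, 801],
})

# Dense rank numbers the distinct dates within each account by value,
# so every row that shares a date also shares a rank.
ranks = df.groupby("account")["date"].rank(method="dense")
print(df.assign(rank=ranks))
```

Because tied rows share a rank, filtering on any single rank returns every repeated row for that date rather than one row per account. The rank also follows the sorted order of the values, not the order in which they appear.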

I believe you need DataFrame.drop_duplicates first, and then take the second value per group with GroupBy.cumcount:

df1 = df.drop_duplicates(['account','date'])

df1 = df1[df1.groupby('account').cumcount().eq(1)]
print (df1)
  account  date
3       A   823
6       B   730

Or with GroupBy.nth:

df1 = df.drop_duplicates(['account','date'])

df1 = df1.groupby('account', as_index=False).nth(1)
print (df1)
  account  date
3       A   823
6       B   730
