Retrieve past n records of pandas DataFrame based on column value
I have the following pandas DataFrame, df:
pay pay2 date
0 209.070007 208.000000 2018-08-06
1 207.110001 209.320007 2018-08-07
2 207.250000 206.050003 2018-08-08
3 208.880005 207.279999 2018-08-09
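Given a last_date, I want to retrieve the past n rows starting from last_date, including the row for last_date itself.
For example, given last_date = '2018-08-08' and n = 3, the resulting DataFrame should look like this: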
pay pay2 date
0 209.070007 208.000000 2018-08-06
1 207.110001 209.320007 2018-08-07
2 207.250000 206.050003 2018-08-08
Use:
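In [285]: s = df.date.eq(last_date)
In [286]: loc = s.index[s][-1]  # label of the last match; s.idxmax() gives the first match
In [287]: df.loc[loc-2:loc]  # n = 3 rows, i.e. loc-(n-1) to loc; .loc slices include both endpoints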
Out[287]:
pay pay2 date
0 209.070007 208.000000 2018-08-06
1 207.110001 209.320007 2018-08-07
2 207.250000 206.050003 2018-08-08
Details
In [288]: s
Out[288]:
0 False
1 False
2 True
3 False
Name: date, dtype: bool
In [289]: loc
Out[289]: 2
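The label arithmetic above relies on df having the default 0, 1, 2, ... index. If the index might be something else, a positional variant with iloc may be safer. The following is only a sketch: the helper name last_n_rows_ending_at is made up for illustration, and it assumes the date column is already datetime64 and the pay columns are floats.

```python
import numpy as np
import pandas as pd

# Sample data from the question (assumed dtypes: float pay columns, datetime64 date)
df = pd.DataFrame({
    "pay": [209.070007, 207.110001, 207.250000, 208.880005],
    "pay2": [208.000000, 209.320007, 206.050003, 207.279999],
    "date": pd.to_datetime(["2018-08-06", "2018-08-07", "2018-08-08", "2018-08-09"]),
})


def last_n_rows_ending_at(frame, last_date, n):
    """Hypothetical helper: the n rows ending at the last row whose date equals last_date."""
    # Integer positions (not labels) of the rows that match last_date
    matches = np.flatnonzero(frame["date"].eq(pd.Timestamp(last_date)).to_numpy())
    if matches.size == 0:
        # No row has that date: return an empty frame with the same columns
        return frame.iloc[0:0]
    end = matches[-1] + 1        # iloc slices exclude the end, so step one past the match
    start = max(end - n, 0)      # clamp so fewer than n available rows does not wrap around
    return frame.iloc[start:end]


result = last_n_rows_ending_at(df, "2018-08-08", 3)
print(result)
```

Because iloc slices are half-open, end - n to end always yields at most n rows, which avoids the off-by-one that inclusive .loc slicing invites.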
I would do it a bit more concisely. First, convert the string to a datetime.
last_date = '2018-08-08'
last_date = pd.to_datetime(last_date)
n = 2
# assumes df['date'] is already datetime64; tail(n) keeps the last n rows up to last_date
df.loc[df['date'] <= last_date].tail(n)
# pay pay2 date
# 1 207.110001 209.320007 2018-08-07
# 2 207.250000 206.050003 2018-08-08
You need:
import pandas as pd

df = pd.DataFrame({'pay': ['209.07', '207.110001', '207.250000', '208.880005'],
                   'pay2': ['208', '209.320007', '206.050003', '207.279999'],
                   'date': ['2018-08-06', '2018-08-07', '2018-08-08', '2018-08-09']})
last_date = pd.to_datetime('2018-08-08')
n = 3
df['date'] = pd.to_datetime(df['date'])
# keep rows on or before last_date, newest first
df_new = df[df['date'] <= last_date].sort_values("date", ascending=False)
# take the n newest rows and restore chronological order
df_new = df_new[:n].sort_values("date", ascending=True)
print(df_new)
Output:
pay pay2 date
0 209.07 208 2018-08-06
1 207.110001 209.320007 2018-08-07
2 207.250000 206.050003 2018-08-08
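The same filter-then-take-last idea can also be written with a date index. This is a sketch rather than a definitive approach; it assumes numeric pay columns, and the variable name by_date is just illustrative. Slicing a sorted DatetimeIndex with .loc[:last_date] includes the end date, and tail(n) keeps the last n of those rows.

```python
import pandas as pd

# Sample data from the question, with numeric pay columns (an assumption)
df = pd.DataFrame({
    "pay": [209.070007, 207.110001, 207.250000, 208.880005],
    "pay2": [208.000000, 209.320007, 206.050003, 207.279999],
    "date": ["2018-08-06", "2018-08-07", "2018-08-08", "2018-08-09"],
})
df["date"] = pd.to_datetime(df["date"])

n = 3
last_date = "2018-08-08"

# Sort by date and index by it so that label slicing works on dates
by_date = df.sort_values("date").set_index("date")

# Rows up to and including last_date, then the last n of them
result = by_date.loc[:last_date].tail(n).reset_index()
print(result)
```

Note that reset_index() moves date back to a regular column (now the first one) and renumbers the rows from 0. As with the <= filter above, this returns rows on or before last_date even when last_date itself is not present in the data.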