Pandas filter dataframe based on date range and another column
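I have a pandas dataframe called df1 and want to filter it based on conditions in a second dataframe, df2: for a given grp_id, I only want the rows from the date in df2's year column up to the most recent year (2016), as shown in df3. This is just a subset of my data; I have at least 10 unique grp_id values to subset, each with a different starting year.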
DF1
db_id cert_status grp_id year cap prov
130 IX-011 not-certified member SD 2004-01-01 30.0 KB
131 IX-011 not-certified member SD 2005-01-01 30.0 KB
132 IX-011 not-certified member SD 2006-01-01 30.0 KB
133 IX-011 not-certified member SD 2007-01-01 30.0 KB
134 IX-011 not-certified member SD 2008-01-01 30.0 KB
135 IX-011 not-certified member SD 2009-01-01 30.0 KB
136 IX-011 not-certified member SD 2010-01-01 30.0 KB
137 IX-011 not-certified member SD 2011-01-01 30.0 KB
138 IX-011 not-certified member SD 2012-01-01 30.0 KB
139 IX-011 not-certified member SD 2013-01-01 30.0 KB
140 IX-011 not-certified member SD 2014-01-01 30.0 KB
141 IX-011 not-certified member SD 2015-01-01 30.0 KB
142 IX-011 not-certified member SD 2016-01-01 30.0 KB
208 IX-017 not-certified member CG 2004-01-01 30.0 KB
209 IX-017 not-certified member CG 2005-01-01 30.0 KB
210 IX-017 not-certified member CG 2006-01-01 30.0 KB
211 IX-017 not-certified member CG 2007-01-01 30.0 KB
212 IX-017 not-certified member CG 2008-01-01 30.0 KB
213 IX-017 not-certified member CG 2009-01-01 30.0 KB
214 IX-017 not-certified member CG 2010-01-01 30.0 KB
215 IX-017 not-certified member CG 2011-01-01 30.0 KB
216 IX-017 not-certified member CG 2012-01-01 30.0 KB
217 IX-017 not-certified member CG 2013-01-01 80.0 KB
218 IX-017 not-certified member CG 2014-01-01 30.0 KB
219 IX-017 not-certified member CG 2015-01-01 30.0 KB
220 IX-017 not-certified member CG 2016-01-01 30.0 KB
DF2
grp_id member year
4 SD Y 2007-01-01
6 CG Y 2011-01-01
DF3
db_id cert_status grp_id year cap prov
133 IX-011 not-certified member SD 2007-01-01 30.0 KB
134 IX-011 not-certified member SD 2008-01-01 30.0 KB
135 IX-011 not-certified member SD 2009-01-01 30.0 KB
136 IX-011 not-certified member SD 2010-01-01 30.0 KB
137 IX-011 not-certified member SD 2011-01-01 30.0 KB
138 IX-011 not-certified member SD 2012-01-01 30.0 KB
139 IX-011 not-certified member SD 2013-01-01 30.0 KB
140 IX-011 not-certified member SD 2014-01-01 30.0 KB
141 IX-011 not-certified member SD 2015-01-01 30.0 KB
142 IX-011 not-certified member SD 2016-01-01 30.0 KB
215 IX-017 not-certified member CG 2011-01-01 30.0 KB
216 IX-017 not-certified member CG 2012-01-01 30.0 KB
217 IX-017 not-certified member CG 2013-01-01 80.0 KB
218 IX-017 not-certified member CG 2014-01-01 30.0 KB
219 IX-017 not-certified member CG 2015-01-01 30.0 KB
220 IX-017 not-certified member CG 2016-01-01 30.0 KB
What is the simplest and fastest way to do this?
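For anyone who wants to experiment, the frames above can be rebuilt roughly as follows. This is only a sketch: it keeps just the columns the filter needs (grp_id, year, cap), assumes year is stored as datetime, and drops db_id, cert_status and prov for brevity.

```python
import pandas as pd

# Trimmed stand-ins for DF1 and DF2: only the columns relevant to the filter.
years = pd.date_range("2004-01-01", "2016-01-01", freq="YS")  # 2004..2016, 13 rows

df1 = pd.DataFrame({
    "grp_id": ["SD"] * len(years) + ["CG"] * len(years),
    "year": list(years) * 2,
    "cap": 30.0,
})
# Reuse the row labels shown in the question (130-142 and 208-220).
df1.index = list(range(130, 143)) + list(range(208, 221))
df1.loc[217, "cap"] = 80.0  # the one row whose cap differs

df2 = pd.DataFrame(
    {
        "grp_id": ["SD", "CG"],
        "member": ["Y", "Y"],
        "year": pd.to_datetime(["2007-01-01", "2011-01-01"]),
    },
    index=[4, 6],
)
```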
Try using merge together with query to filter:
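# Carry df1's index through the merge as a column, attach each group's start
# year (year_2), keep rows on or after it, then restore the original index.
df1.reset_index().merge(df2, on='grp_id', suffixes=('', '_2'))\
.query('year >= year_2').set_index('index').rename_axis(None)[df1.columns]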
Output:
db_id cert_status grp_id year cap prov
133 IX-011 not-certified member SD 2007-01-01 30.0 KB
134 IX-011 not-certified member SD 2008-01-01 30.0 KB
135 IX-011 not-certified member SD 2009-01-01 30.0 KB
136 IX-011 not-certified member SD 2010-01-01 30.0 KB
137 IX-011 not-certified member SD 2011-01-01 30.0 KB
138 IX-011 not-certified member SD 2012-01-01 30.0 KB
139 IX-011 not-certified member SD 2013-01-01 30.0 KB
140 IX-011 not-certified member SD 2014-01-01 30.0 KB
141 IX-011 not-certified member SD 2015-01-01 30.0 KB
142 IX-011 not-certified member SD 2016-01-01 30.0 KB
215 IX-017 not-certified member CG 2011-01-01 30.0 KB
216 IX-017 not-certified member CG 2012-01-01 30.0 KB
217 IX-017 not-certified member CG 2013-01-01 80.0 KB
218 IX-017 not-certified member CG 2014-01-01 30.0 KB
219 IX-017 not-certified member CG 2015-01-01 30.0 KB
220 IX-017 not-certified member CG 2016-01-01 30.0 KB
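A merge is not the only way to get this result. One possible alternative is to look up each row's start year with Series.map and then apply a boolean mask, which leaves df1's index untouched. The helper name filter_from_start and the toy frames below are made up for illustration, and the sketch assumes df2 holds exactly one row per grp_id and that both year columns are datetime.

```python
import pandas as pd


def filter_from_start(df1, df2):
    """Keep rows of df1 dated on or after their group's start year in df2."""
    # Lookup table: grp_id -> start year (assumes grp_id is unique in df2).
    start_by_group = df2.set_index("grp_id")["year"]
    # Align a start year to every df1 row; groups missing from df2 get NaT.
    start = df1["grp_id"].map(start_by_group)
    # Comparisons against NaT are False, so rows of unlisted groups are dropped.
    return df1[df1["year"] >= start]


# Toy data: a few rows per group, with the row labels from the question.
df1 = pd.DataFrame(
    {
        "grp_id": ["SD", "SD", "SD", "CG", "CG", "CG"],
        "year": pd.to_datetime(
            ["2006-01-01", "2007-01-01", "2008-01-01",
             "2010-01-01", "2011-01-01", "2012-01-01"]
        ),
    },
    index=[132, 133, 134, 214, 215, 216],
)
df2 = pd.DataFrame(
    {
        "grp_id": ["SD", "CG"],
        "member": ["Y", "Y"],
        "year": pd.to_datetime(["2007-01-01", "2011-01-01"]),
    },
    index=[4, 6],
)

df3 = filter_from_start(df1, df2)
print(df3.index.tolist())  # [133, 134, 215, 216]
```

Because no join is materialized, this version avoids the temporary year_2 column; whether it is actually faster than the merge on a given dataset would need to be measured.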