How do I get the values of a particular column in a pandas dataframe based on a date condition on another column?
I have a dataframe which looks like this:
A B Start_Date
1 4 2003-05-22
2 6 2003-05-31
....
57 406 2018-09-08
I want to get the value of B at a date a fixed number of years after the Start_Date. For instance, for each row I want the value of column B at the last date on or before 10 years after that row's Start_Date. So the result would look something like this:
A B Start_Date D
1 4 2003-05-22 <value of B on or before (last value before) 2013-05-22>
2 6 2003-05-31 <value of B on or before (last value before) 2013-05-31>
....
57 406 2018-09-08 <value of B on or before (last value before) 2028-09-08>
When I try something like this ('Start_Date plus 10' is just another column with 10 years added to the Start_Date column):
df['D'] = df[df['Start_Date'] <= df['Start_Date plus 10']]['B'].max()
It just gives the maximum value of column B, which is understandable but is not my end objective. Please help with suggestions on this. Please let me know if there is ambiguity in the question or if anything needs to be clarified further. Thank you for taking the time to read this and answer the question.
I am not sure if this is exactly what you need; let me know if it does not work. For example, if you have a DataFrame like:
import pandas as pd

tempDF = pd.DataFrame({'dates': ['2003-05-20',
                                 '2003-05-21',
                                 '2003-05-22',
                                 '2003-05-23',
                                 '2003-05-24',
                                 '2003-05-25']})
and you define your dates like:
min_date = '2003-05-21'
max_date = '2003-05-23'
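One caveat: the values above are plain strings, and comparing them only works because ISO-formatted dates (YYYY-MM-DD) happen to sort in calendar order. A minimal sketch of converting to real datetimes first, assuming the same sample data (the name in_range is only for illustration):

```python
import pandas as pd

tempDF = pd.DataFrame({'dates': ['2003-05-20', '2003-05-21', '2003-05-22',
                                 '2003-05-23', '2003-05-24', '2003-05-25']})

# Parse the strings into datetime64 values so comparisons are true date comparisons.
tempDF['dates'] = pd.to_datetime(tempDF['dates'])

min_date = pd.Timestamp('2003-05-21')
max_date = pd.Timestamp('2003-05-23')

# Series.between includes both endpoints by default.
in_range = tempDF['dates'].between(min_date, max_date)
print(tempDF[in_range])
```

The filtering approaches shown next behave the same way with either strings or Timestamp values, as long as the column and the bounds are of the same kind.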
You have different options. You can select all entries on or after one date and on or before another date by combining both conditions in a single boolean mask (chaining two separate boolean selections also works, but pandas warns that the second mask has to be reindexed):
filteredDF = tempDF[(tempDF['dates'] >= min_date) & (tempDF['dates'] <= max_date)]
Or you could use the 'query' method:
filteredDF = tempDF.query('dates >= @min_date').query('dates <= @max_date')
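If what you actually need is, for every row, the last value of B on or before Start_Date plus 10 years, then an as-of join might be closer to the goal than a range filter. The following is only a sketch under assumptions: the sample rows are made up, Start_Date is assumed to be (or be convertible to) a datetime column, and the helper names Target_Date, Lookup_Date and lookup are hypothetical. It uses pd.merge_asof with direction='backward', which picks the last right-hand row whose key is less than or equal to the left-hand key.

```python
import pandas as pd

# Hypothetical sample data in the shape of the question's frame.
df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [4, 6, 9, 15],
    'Start_Date': pd.to_datetime(['2003-05-22', '2003-05-31',
                                  '2013-05-25', '2018-09-08']),
})

# Date to look up for each row: Start_Date plus 10 years.
df['Target_Date'] = df['Start_Date'] + pd.DateOffset(years=10)

# Lookup table of (date, value of B), renamed to avoid column name clashes.
lookup = (df[['Start_Date', 'B']]
          .rename(columns={'Start_Date': 'Lookup_Date', 'B': 'D'})
          .sort_values('Lookup_Date'))

# merge_asof needs both sides sorted by their keys.
result = pd.merge_asof(df.sort_values('Target_Date'), lookup,
                       left_on='Target_Date', right_on='Lookup_Date',
                       direction='backward')

print(result[['A', 'B', 'Start_Date', 'D']])
```

In this sketch, rows whose target date lies beyond the last Start_Date in the data simply receive the last available value of B. If a target date were earlier than every Start_Date, D would be NaN for that row.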