Pandas quantile function for dates?
I have a dataframe of donation amounts and dates. I would like to see how long it took a certain proportion of the donations to come in (at what point did we have 25% of donations? 75%?). It looked like the Pandas quantile function would do what I want. However, it seems to only want numbers, not dates. Is there a function that would do the same with dates?
http://pandas.pydata.org/pandas-docs/dev/generated/pandas.core.groupby.DataFrameGroupBy.quantile.html#pandas.core.groupby.DataFrameGroupBy.quantile
As Evert said, you can temporarily convert the column to int64, compute the quantiles, and then convert the result back to datetime:
YOUR_DATAFRAME.YOUR_DATE.astype('int64').quantile([.25,.5,.75]).astype('datetime64[ns]')
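To make the round trip concrete, here is a minimal sketch on made-up data. The `donations` frame and its column names are assumptions for illustration, not taken from the question. It rounds the interpolated quantiles back to whole nanoseconds before converting, which is a small variation on the one-liner above rather than the exact same call chain.

```python
import pandas as pd

# Hypothetical donation data; the frame and column names are illustrative only.
donations = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04", "2020-01-05",
    ]),
    "amount": [10, 20, 30, 40, 50],
})

# View each timestamp as integer nanoseconds since the Unix epoch.
# Forcing nanosecond resolution first keeps the integers in a known unit.
as_ns = donations["date"].astype("datetime64[ns]").astype("int64")

# Quantiles of the integers come back as floats (linear interpolation),
# so round them to whole nanoseconds before turning them into dates again.
quartiles_ns = as_ns.quantile([0.25, 0.5, 0.75]).round().astype("int64")
quartiles = pd.to_datetime(quartiles_ns)

print(quartiles)
```

Note that this gives quantiles of the donation dates themselves (each donation counts once), not of the cumulative donated amount.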
I had the same problem, in my case to split a timeseries for a machine learning problem.
I wrote the following based on the above answers by evert and steboc, and added the case where the dates might be written as strings:
    import pandas as pd

    def get_split_date(df, date_column, quantile):
        """Get the date on which to split a dataframe for timeseries splitting."""
        # 1. convert date_column to datetime (useful in case it is a string);
        #    errors='coerce' is the current spelling of the older coerce=True
        # 2. drop unparseable values (NaT) and convert into int (for sorting)
        # 3. get the quantile
        # 4. get the corresponding date
        # 5. return, pray that it works
        dates = pd.to_datetime(df[date_column], errors='coerce').dropna()
        quantile_date = pd.to_datetime(dates.astype('int64').quantile(q=quantile))
        return quantile_date
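For the timeseries-splitting use case, a rough sketch of how such a cutoff date might be applied is below. The helper `split_by_date_quantile`, the `frame` data, and the train/test naming are all hypothetical, and the block recomputes the cutoff itself so that it runs on its own.

```python
import pandas as pd

def split_by_date_quantile(df, date_column, quantile):
    """Hypothetical helper: split df at the given quantile of its dates."""
    # Parse strings into datetimes; unparseable entries become NaT.
    dates = pd.to_datetime(df[date_column], errors="coerce")
    # Work on valid dates only, as integer nanoseconds since the epoch.
    valid_ns = dates.dropna().astype("datetime64[ns]").astype("int64")
    # The quantile is a float; truncate to whole nanoseconds for Timestamp.
    cutoff = pd.Timestamp(int(valid_ns.quantile(quantile)))
    # Rows on or before the cutoff go to the first part, later rows to the second.
    # Rows with NaT dates compare False both ways and land in neither part.
    return cutoff, df[dates <= cutoff], df[dates > cutoff]

# Made-up example data with dates stored as strings.
frame = pd.DataFrame({
    "day": ["2021-03-01", "2021-03-02", "2021-03-03", "2021-03-04", "2021-03-05"],
    "value": [1, 2, 3, 4, 5],
})

cutoff, train, test = split_by_date_quantile(frame, "day", 0.5)
print(cutoff)
print(len(train), len(test))
```

Splitting on a date rather than on a row count keeps all rows that share a timestamp on the same side of the split, which is usually what you want for time-ordered data.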