Pandas quantile function for dates?

I have a dataframe of donation amounts and dates. I would like to see how long it took for a certain proportion of the donations to come in (at what point did we have 25% of donations? 75%?). It looked like the Pandas quantile function would do what I want. However, it seems to only accept numbers, not dates. Is there a function that would do the same with dates?

http://pandas.pydata.org/pandas-docs/dev/generated/pandas.core.groupby.DataFrameGroupBy.quantile.html#pandas.core.groupby.DataFrameGroupBy.quantile

As Evert said, you can temporarily convert the dates to int64 for the calculation and then convert the result back to datetime:

YOUR_DATAFRAME.YOUR_DATE.astype('int64').quantile([.25,.5,.75]).astype('datetime64[ns]')
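
To make the round trip concrete, here is a minimal sketch on made-up data. The frame `donations` and its columns `date` and `amount` are hypothetical and not part of the original question. For comparison, it also calls `quantile` on the datetime column itself, which recent pandas versions appear to accept, so the int64 detour may only be needed on older releases.

```python
import pandas as pd

# Hypothetical donation data; names and values are made up for illustration.
donations = pd.DataFrame({
    "date": pd.to_datetime(
        ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04", "2020-01-05"]
    ),
    "amount": [10, 25, 5, 50, 20],
})

# Round trip: datetime -> int64 nanoseconds since the epoch -> quantile -> datetime.
# The explicit datetime64[ns] cast pins the unit so the integers really are nanoseconds.
as_ns = donations["date"].astype("datetime64[ns]").astype("int64")
quantile_dates = pd.to_datetime(as_ns.quantile([0.25, 0.5, 0.75]), unit="ns")

# For comparison: quantile called on the datetime column directly.
direct = donations["date"].quantile([0.25, 0.5, 0.75])

print(quantile_dates)
print(direct)
```

Note that this gives quantiles of the donation dates, i.e. the point by which a given share of the donations (by count) had arrived. It does not weight by `amount`.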

I had the same problem, in my case to split a time series for a machine learning problem.

I wrote the following based on the above answers by evert and steboc, and added the case where the dates might be written as strings:

import pandas as pd

def get_split_date(df, date_column, quantile):
    """ Get the date on which to split a dataframe for timeseries splitting """

    # 1. convert date_column to datetime (useful in case it is a string);
    #    values that cannot be parsed become NaT and are dropped
    # 2. convert into int (nanoseconds since the epoch, so quantile sees numbers)
    # 3. get the quantile
    # 4. get the corresponding date
    # 5. return, pray that it works

    dates = pd.to_datetime(df[date_column], errors='coerce').dropna()
    quantile_ns = dates.astype('datetime64[ns]').astype('int64').quantile(q=quantile)
    quantile_date = pd.to_datetime(quantile_ns, unit='ns')

    return quantile_date
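
As a rough illustration of the time-series split this answer is aiming at, the sketch below computes a split date and partitions a frame around it. The helper `split_by_date_quantile`, the frame `df`, and its columns `day` and `value` are all hypothetical. It also assumes a pandas version in which `Series.quantile` accepts datetime values directly; on older versions the int64 round trip described above would be needed instead.

```python
import pandas as pd

def split_by_date_quantile(df, date_column, quantile):
    """Hypothetical helper: split df into (train, test) at a date quantile."""
    # Parse strings to datetimes; unparseable values become NaT.
    dates = pd.to_datetime(df[date_column], errors="coerce")
    split_date = dates.quantile(quantile)
    # Rows with NaT dates compare False on both sides and land in neither part.
    train = df[dates <= split_date]
    test = df[dates > split_date]
    return split_date, train, test

# Made-up frame with dates stored as strings.
df = pd.DataFrame({
    "day": ["2021-03-01", "2021-03-02", "2021-03-03", "2021-03-04", "2021-03-05"],
    "value": [1, 2, 3, 4, 5],
})

split_date, train, test = split_by_date_quantile(df, "day", 0.75)
print(split_date, len(train), len(test))
```

Splitting on a date rather than on a random sample keeps every test row later in time than every training row, which is usually the point of a time-series split.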
