
Find median in nth range in Python

I am trying to find the median of the values in my dataset for every 15-day period. The dataset has three columns: index, value and date.

The medians are needed to evaluate each 15-day period against some conditions: each block of 15 days will get a new value according to those conditions. I've tried several approaches (mostly Python list comprehensions), but I am still too much of a beginner to solve it properly.

    value   date        index
14  13065   1983-07-15  14
15  13065   1983-07-16  15
16  13065   1983-07-17  16
17  13065   1983-07-18  17
18  13065   1983-07-19  18
19  13065   1983-07-20  19
20  13065   1983-07-21  20
21  13065   1983-07-22  21
22  13065   1983-07-23  22
23  .....    .........  .. 

medians = [dataset['value'].median() for range(0, len(dataset['index']), 15) in dataset['value']]   

I expect to collect the medians from the DataFrame in a new variable, but instead I get this error:

SyntaxError: can't assign to function call

Assuming you have data in the format below:

import numpy as np
import pandas as pd

test = pd.DataFrame({'date': pd.date_range(start='2016/02/12', periods=1000, freq='1D'),
                     'value': np.random.randint(1, 1000, 1000)})
test.head()

    date       value
0   2016-02-12  243
1   2016-02-13  313
2   2016-02-14  457
3   2016-02-15  236
4   2016-02-16  893

If you want the median for every 15 days, use pd.Grouper and group by date:

test.groupby(pd.Grouper(freq='15D', key='date')).median().reset_index()

date        value
2016-02-12  457.0
2016-02-27  733.0
2016-03-13  688.0
2016-03-28  504.0
2016-04-12  591.0

Note that when using pd.Grouper, your date column must be of type datetime. If it's not, convert it using:

test['date'] = pd.to_datetime(test['date'])
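Because the sample data above is random, the medians are hard to check by eye. The following is a minimal sketch on a small deterministic frame; the `demo` frame and the `median_15d` column name are hypothetical and not part of the original answer. It also shows one possible way to write each bin's median back onto its rows with `transform`, which may help when every 15-day block needs a new value.

```python
import pandas as pd

# Hypothetical data: 45 consecutive days with values 0..44
demo = pd.DataFrame({
    'date': pd.date_range(start='2016-02-12', periods=45, freq='D'),
    'value': list(range(45)),
})

# One median per 15-day bin, bins anchored at the first date
bins = pd.Grouper(freq='15D', key='date')
medians = demo.groupby(bins)['value'].median().reset_index()
print(medians)

# Broadcast each bin's median back to every row of that bin
demo['median_15d'] = demo.groupby(bins)['value'].transform('median')
print(demo.head())
```

With 45 rows there are exactly three full bins, so the medians are the middle values of 0..14, 15..29 and 30..44.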

Use DataFrame.resample with median:

# if necessary, convert the column to datetimes
dataset['date'] = pd.to_datetime(dataset['date'])

# median of 'value' within each 15-day bin
dataset = dataset.resample('15D', on='date')['value'].median().reset_index()
print(dataset)
        date  value
0 1983-07-15  13065
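For comparison, the list comprehension from the question can be made to work by iterating over start positions and slicing 15 rows at a time. This is only a sketch, assuming one row per day with no gaps; the `demo` frame is hypothetical. Under that assumption the positional chunks line up with the calendar bins produced by `resample`.

```python
import pandas as pd

# Hypothetical data: 40 consecutive days with values 0..39
demo = pd.DataFrame({
    'date': pd.date_range(start='1983-07-01', periods=40, freq='D'),
    'value': list(range(40)),
})

# Positional approach: median of each block of 15 rows
chunk_medians = [
    demo['value'].iloc[start:start + 15].median()
    for start in range(0, len(demo), 15)
]

# Calendar approach: median of each 15-day bin
resampled = demo.resample('15D', on='date')['value'].median()

print(chunk_medians)
print(resampled.tolist())
```

If the data has missing days, the two approaches diverge: slicing counts rows, while `resample` counts calendar days, so the calendar-based version is usually the safer choice for "every 15 days".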
