
Python: Max & Min value on each hour

I have a 2x720 array. The first column is the datetime and the second column is the value. My data looks like this:

[(datetime.datetime(2015,4,26,0,10),25.2),
(datetime.datetime(2015,4,26,0,20),25.1),
(datetime.datetime(2015,4,26,0,30),25.7),
(datetime.datetime(2015,4,26,0,40),23.2),
(datetime.datetime(2015,4,26,0,50),22.2),
(datetime.datetime(2015,4,26,0,59),29.2),
(datetime.datetime(2015,4,26,1,00),22.2),
(datetime.datetime(2015,4,26,1,10),21.2), ...]

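All of the data is on the same date. I just want to organize the data by hour so I can plot it as a candlestick chart (max and min only; I don't need open and close). I want data like this: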

[(datetime.datetime(2015,4,26,0,00),max in hour 0, min in hour 0),
(datetime.datetime(2015,4,26,1,00),max in hour 1, min in hour 1),    
(datetime.datetime(2015,4,26,2,00),max in hour 2, min in hour 2), ...
(datetime.datetime(2015,4,26,23,00),max in hour 23, min in hour 23)]

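I am new to Python and would like a nice, short script. I used C++ before (a long time ago), and I find that Python is not just programming but an art. I searched for an answer for a while but could not find one that meets my requirement. Thanks for your help.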

Since they are all on the same day, group the values by hour and then summarize each group.

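import datetime
from collections import defaultdict

start_of_day = datetime.datetime(2015, 4, 26)

# Collect every value under the hour (0-23) of its timestamp.
hour_to_values = defaultdict(list)
for dt, value in your_list_of_values:
    hour_to_values[dt.hour].append(value)

# One (hour start, min, max) tuple per hour, in chronological order.
# dict.items() replaces the Python 2-only iteritems().
result = [(start_of_day + datetime.timedelta(hours=hour),
           min(values), max(values))
          for hour, values in sorted(hour_to_values.items())]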
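The grouping above relies on every reading falling on one date. As a minimal sketch of how the same idea might be extended to readings that span several days, the bucket key can be the timestamp truncated to the hour instead of `dt.hour`. The names `hourly_max_min` and `readings` are hypothetical, the first three rows come from the question, and the last row is made up for illustration.

```python
import datetime
from collections import defaultdict


def hourly_max_min(readings):
    """Return sorted (hour_start, max, min) tuples for (datetime, value) pairs."""
    buckets = defaultdict(list)
    for dt, value in readings:
        # Truncate to the top of the hour so different days stay separate.
        hour_start = dt.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(value)
    return [(hour_start, max(values), min(values))
            for hour_start, values in sorted(buckets.items())]


readings = [
    (datetime.datetime(2015, 4, 26, 0, 10), 25.2),
    (datetime.datetime(2015, 4, 26, 0, 50), 22.2),
    (datetime.datetime(2015, 4, 26, 1, 0), 22.2),
    (datetime.datetime(2015, 4, 27, 0, 10), 30.1),  # made-up next-day row
]
result = hourly_max_min(readings)
for row in result:
    print(row)
```

The tuples come out in the (datetime, max, min) order requested in the question, and the input does not need to be sorted.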

The following assumes the list is already sorted by datetime.

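output = []
current_hour = None
current_output = None
for point in data:
    phour = point[0].hour
    pvalue = point[1]
    # Compare with ==, not "is": identity checks on ints are unreliable.
    if phour == current_hour:
        # Same hour as the previous point: update the running min and max.
        if pvalue < current_output[1]:
            current_output[1] = pvalue
        if pvalue > current_output[2]:
            current_output[2] = pvalue
    else:
        # New hour: start a [datetime, min, max] entry from this point.
        current_hour = phour
        output.append([point[0], pvalue, pvalue])
        current_output = output[-1]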

If your data looks like this:

>>> arr
[[datetime.datetime(2015, 4, 26, 0, 0), 0.9627101684867109], [datetime.datetime(2015, 4, 26, 0, 20), 0.8894632247614254], [datetime.datetime(2015, 4, 26, 0, 40), 0.1920554638586589], [datetime.datetime(2015, 4, 26, 1, 0), 0.24394390686092926], [datetime.datetime(2015, 4, 26, 1, 20), 0.9870880292994234], [datetime.datetime(2015, 4, 26, 1, 40), 0.8154734773666351], [datetime.datetime(2015, 4, 26, 2, 0), 0.5074101780070644], [datetime.datetime(2015, 4, 26, 2, 20), 0.6211085118418351], [datetime.datetime(2015, 4, 26, 2, 40), 0.1309246438480619], [datetime.datetime(2015, 4, 26, 3, 0), 0.2042948575387714], [datetime.datetime(2015, 4, 26, 3, 20), 0.90969148583095], [datetime.datetime(2015, 4, 26, 3, 40), 0.9260473796075621], [datetime.datetime(2015, 4, 26, 4, 0), 0.08180604335801178], [datetime.datetime(2015, 4, 26, 4, 20), 0.9909948477818202], [datetime.datetime(2015, 4, 26, 4, 40), 0.6306008554115328], [datetime.datetime(2015, 4, 26, 5, 0), 0.7218791510465083], [datetime.datetime(2015, 4, 26, 5, 20), 0.5751211758007434], [datetime.datetime(2015, 4, 26, 5, 40), 0.8643323785674638], [datetime.datetime(2015, 4, 26, 6, 0), 0.44366887986412196], [datetime.datetime(2015, 4, 26, 6, 20), 0.5845914793227223], [datetime.datetime(2015, 4, 26, 6, 40), 0.9816449110831348], [datetime.datetime(2015, 4, 26, 7, 0), 0.7976769524401801], [datetime.datetime(2015, 4, 26, 7, 20), 0.019715644725192494], [datetime.datetime(2015, 4, 26, 7, 40), 0.774857573501942], [datetime.datetime(2015, 4, 26, 8, 0), 0.971010849289862], [datetime.datetime(2015, 4, 26, 8, 20), 0.9854650056341737], [datetime.datetime(2015, 4, 26, 8, 40), 0.44764478642480565], [datetime.datetime(2015, 4, 26, 9, 0), 0.41757419665518836], [datetime.datetime(2015, 4, 26, 9, 20), 0.2428205990660569], [datetime.datetime(2015, 4, 26, 9, 40), 0.7652296383460859], [datetime.datetime(2015, 4, 26, 10, 0), 0.6148904798625167], [datetime.datetime(2015, 4, 26, 10, 20), 0.5437523646936837], [datetime.datetime(2015, 4, 26, 10, 40), 
0.7867821039231312], [datetime.datetime(2015, 4, 26, 11, 0), 0.7178834338473005], [datetime.datetime(2015, 4, 26, 11, 20), 0.4349509857268635], [datetime.datetime(2015, 4, 26, 11, 40), 0.2819549901100772], [datetime.datetime(2015, 4, 26, 12, 0), 0.0849398640248602], [datetime.datetime(2015, 4, 26, 12, 20), 0.6260259998494316], [datetime.datetime(2015, 4, 26, 12, 40), 0.8353818765863841], [datetime.datetime(2015, 4, 26, 13, 0), 0.17232607867607763], [datetime.datetime(2015, 4, 26, 13, 20), 0.17091634151665247], [datetime.datetime(2015, 4, 26, 13, 40), 0.7653484731068122], [datetime.datetime(2015, 4, 26, 14, 0), 0.9510280942218504], [datetime.datetime(2015, 4, 26, 14, 20), 0.2696780695726898], [datetime.datetime(2015, 4, 26, 14, 40), 0.6634142333370054], [datetime.datetime(2015, 4, 26, 15, 0), 0.48395825825107863], [datetime.datetime(2015, 4, 26, 15, 20), 0.7669839652095866], [datetime.datetime(2015, 4, 26, 15, 40), 0.9479268674677883], [datetime.datetime(2015, 4, 26, 16, 0), 0.9046641495205922], [datetime.datetime(2015, 4, 26, 16, 20), 0.045289391652820865], [datetime.datetime(2015, 4, 26, 16, 40), 0.7932951067126703], [datetime.datetime(2015, 4, 26, 17, 0), 0.4419846953059643], [datetime.datetime(2015, 4, 26, 17, 20), 0.11146542138230242], [datetime.datetime(2015, 4, 26, 17, 40), 0.5887496294547572], [datetime.datetime(2015, 4, 26, 18, 0), 0.08733136331114111], [datetime.datetime(2015, 4, 26, 18, 20), 0.7957160332912587], [datetime.datetime(2015, 4, 26, 18, 40), 0.8128833057460692], [datetime.datetime(2015, 4, 26, 19, 0), 0.21977323027233342], [datetime.datetime(2015, 4, 26, 19, 20), 0.20504702851137402], [datetime.datetime(2015, 4, 26, 19, 40), 0.6555892081746738], [datetime.datetime(2015, 4, 26, 20, 0), 0.7380315441194354], [datetime.datetime(2015, 4, 26, 20, 20), 0.8075383278433004], [datetime.datetime(2015, 4, 26, 20, 40), 0.837007721004194], [datetime.datetime(2015, 4, 26, 21, 0), 0.8842141478652727], [datetime.datetime(2015, 4, 26, 21, 20), 
0.3349342531521037], [datetime.datetime(2015, 4, 26, 21, 40), 0.811383235093619], [datetime.datetime(2015, 4, 26, 22, 0), 0.8273356582091318], [datetime.datetime(2015, 4, 26, 22, 20), 0.17269590855559502], [datetime.datetime(2015, 4, 26, 22, 40), 0.13561711047456493], [datetime.datetime(2015, 4, 26, 23, 0), 0.8906156794457442], [datetime.datetime(2015, 4, 26, 23, 20), 0.2653437814631542]]

(I reproduced it using random data for the second element), you can put it into buckets by hour:

>>> buckets={}
>>> for t in arr: 
...    buckets.setdefault(t[0].hour, []).append(t)

Then sort the keys and get the max and min of each bucket, using the second element of each row as the key:

>>> for hour in sorted(buckets):
...    print(hour, max(buckets[hour], key=lambda l: l[1]), min(buckets[hour], key=lambda l: l[1]))

0 [datetime.datetime(2015, 4, 26, 0, 0), 0.9627101684867109] [datetime.datetime(2015, 4, 26, 0, 40), 0.1920554638586589]
1 [datetime.datetime(2015, 4, 26, 1, 20), 0.9870880292994234] [datetime.datetime(2015, 4, 26, 1, 0), 0.24394390686092926]
2 [datetime.datetime(2015, 4, 26, 2, 20), 0.6211085118418351] [datetime.datetime(2015, 4, 26, 2, 40), 0.1309246438480619]
3 [datetime.datetime(2015, 4, 26, 3, 40), 0.9260473796075621] [datetime.datetime(2015, 4, 26, 3, 0), 0.2042948575387714]
4 [datetime.datetime(2015, 4, 26, 4, 20), 0.9909948477818202] [datetime.datetime(2015, 4, 26, 4, 0), 0.08180604335801178]
5 [datetime.datetime(2015, 4, 26, 5, 40), 0.8643323785674638] [datetime.datetime(2015, 4, 26, 5, 20), 0.5751211758007434]
6 [datetime.datetime(2015, 4, 26, 6, 40), 0.9816449110831348] [datetime.datetime(2015, 4, 26, 6, 0), 0.44366887986412196]
7 [datetime.datetime(2015, 4, 26, 7, 0), 0.7976769524401801] [datetime.datetime(2015, 4, 26, 7, 20), 0.019715644725192494]
8 [datetime.datetime(2015, 4, 26, 8, 20), 0.9854650056341737] [datetime.datetime(2015, 4, 26, 8, 40), 0.44764478642480565]
9 [datetime.datetime(2015, 4, 26, 9, 40), 0.7652296383460859] [datetime.datetime(2015, 4, 26, 9, 20), 0.2428205990660569]
10 [datetime.datetime(2015, 4, 26, 10, 40), 0.7867821039231312] [datetime.datetime(2015, 4, 26, 10, 20), 0.5437523646936837]
11 [datetime.datetime(2015, 4, 26, 11, 0), 0.7178834338473005] [datetime.datetime(2015, 4, 26, 11, 40), 0.2819549901100772]
12 [datetime.datetime(2015, 4, 26, 12, 40), 0.8353818765863841] [datetime.datetime(2015, 4, 26, 12, 0), 0.0849398640248602]
13 [datetime.datetime(2015, 4, 26, 13, 40), 0.7653484731068122] [datetime.datetime(2015, 4, 26, 13, 20), 0.17091634151665247]
14 [datetime.datetime(2015, 4, 26, 14, 0), 0.9510280942218504] [datetime.datetime(2015, 4, 26, 14, 20), 0.2696780695726898]
15 [datetime.datetime(2015, 4, 26, 15, 40), 0.9479268674677883] [datetime.datetime(2015, 4, 26, 15, 0), 0.48395825825107863]
16 [datetime.datetime(2015, 4, 26, 16, 0), 0.9046641495205922] [datetime.datetime(2015, 4, 26, 16, 20), 0.045289391652820865]
17 [datetime.datetime(2015, 4, 26, 17, 40), 0.5887496294547572] [datetime.datetime(2015, 4, 26, 17, 20), 0.11146542138230242]
18 [datetime.datetime(2015, 4, 26, 18, 40), 0.8128833057460692] [datetime.datetime(2015, 4, 26, 18, 0), 0.08733136331114111]
19 [datetime.datetime(2015, 4, 26, 19, 40), 0.6555892081746738] [datetime.datetime(2015, 4, 26, 19, 20), 0.20504702851137402]
20 [datetime.datetime(2015, 4, 26, 20, 40), 0.837007721004194] [datetime.datetime(2015, 4, 26, 20, 0), 0.7380315441194354]
21 [datetime.datetime(2015, 4, 26, 21, 0), 0.8842141478652727] [datetime.datetime(2015, 4, 26, 21, 20), 0.3349342531521037]
22 [datetime.datetime(2015, 4, 26, 22, 0), 0.8273356582091318] [datetime.datetime(2015, 4, 26, 22, 40), 0.13561711047456493]
23 [datetime.datetime(2015, 4, 26, 23, 0), 0.8906156794457442] [datetime.datetime(2015, 4, 26, 23, 20), 0.2653437814631542]

If your data is already sorted by the datetime element, you can skip the separate bucketing step and use groupby:

>>> from itertools import groupby
>>> for hour, group in groupby(arr, lambda t: t[0].hour):
...     li=list(group)
...     print(hour, max(li, key=lambda l: l[1]), min(li, key=lambda l: l[1]))
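`groupby` only merges consecutive items with equal keys, so unsorted input would produce several partial groups for the same hour. A minimal sketch, assuming the rows may arrive out of order and all fall on the same day: sort by the datetime first, then group. `hourly_extremes` and `rows` are hypothetical names, and the values are made up for illustration.

```python
import datetime
from itertools import groupby
from operator import itemgetter


def hourly_extremes(rows):
    """Yield (hour, max_value, min_value) for each hour found in rows."""
    ordered = sorted(rows, key=itemgetter(0))  # groupby needs sorted input
    for hour, group in groupby(ordered, key=lambda row: row[0].hour):
        values = [value for _, value in group]
        yield hour, max(values), min(values)


rows = [
    [datetime.datetime(2015, 4, 26, 1, 20), 0.9],
    [datetime.datetime(2015, 4, 26, 0, 0), 0.5],
    [datetime.datetime(2015, 4, 26, 1, 0), 0.2],
    [datetime.datetime(2015, 4, 26, 0, 40), 0.1],
]
extremes = list(hourly_extremes(rows))
print(extremes)  # [(0, 0.5, 0.1), (1, 0.9, 0.2)]
```

Unlike the loop above, this keeps only the values rather than the whole row, which is closer to the output the question asks for.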

You can use pandas.

import datetime
import pandas as pd

  1. Create the DataFrame and sort it by time

     df = pd.DataFrame(d, columns = ['time', 'price']).sort_values('time')

where d is the list of tuples from your input.

                 time  price
0 2015-04-26 00:10:00   25.2
1 2015-04-26 00:20:00   25.1
2 2015-04-26 00:30:00   25.7
3 2015-04-26 00:40:00   23.2
4 2015-04-26 00:50:00   22.2
5 2015-04-26 00:59:00   29.2
6 2015-04-26 01:00:00   22.2
7 2015-04-26 01:10:00   21.2

  2. Create a column holding the date and hour

     df['day_hour'] = df.apply(lambda r: datetime.datetime(r['time'].year, r['time'].month, r['time'].day, r['time'].hour,0), axis = 1)

                 time  price            day_hour
0 2015-04-26 00:10:00   25.2 2015-04-26 00:00:00
1 2015-04-26 00:20:00   25.1 2015-04-26 00:00:00
2 2015-04-26 00:30:00   25.7 2015-04-26 00:00:00
3 2015-04-26 00:40:00   23.2 2015-04-26 00:00:00
4 2015-04-26 00:50:00   22.2 2015-04-26 00:00:00
5 2015-04-26 00:59:00   29.2 2015-04-26 00:00:00
6 2015-04-26 01:00:00   22.2 2015-04-26 01:00:00
7 2015-04-26 01:10:00   21.2 2015-04-26 01:00:00

  3. Drop the original 'time' column, since it is not used in the output

     df = df.drop('time', axis = 1)

  4. Group the data by date and hour

     dfgrouped = df.groupby('day_hour')

  5. Get the max and min for each day_hour

     dfmax = dfgrouped.max()
     dfmin = dfgrouped.min()

  6. Join the max and min that share the same day_hour

     dfout = dfmax.join(dfmin, lsuffix='_max', rsuffix='_min')
>>> dfout
                     price_max  price_min
day_hour                                 
2015-04-26 00:00:00       29.2       22.2
2015-04-26 01:00:00       22.2       21.2
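
With a reasonably recent pandas release, the steps above can likely be condensed into one grouped aggregation. This is a minimal sketch, not the answer's own method: it assumes a pandas version that provides `Series.dt.floor`, and `d` is filled with the sample rows printed in this answer.

```python
import datetime
import pandas as pd

d = [
    (datetime.datetime(2015, 4, 26, 0, 10), 25.2),
    (datetime.datetime(2015, 4, 26, 0, 20), 25.1),
    (datetime.datetime(2015, 4, 26, 0, 30), 25.7),
    (datetime.datetime(2015, 4, 26, 0, 40), 23.2),
    (datetime.datetime(2015, 4, 26, 0, 50), 22.2),
    (datetime.datetime(2015, 4, 26, 0, 59), 29.2),
    (datetime.datetime(2015, 4, 26, 1, 0), 22.2),
    (datetime.datetime(2015, 4, 26, 1, 10), 21.2),
]
df = pd.DataFrame(d, columns=['time', 'price'])

# Round each timestamp down to the start of its hour, then aggregate.
day_hour = df['time'].dt.floor('h').rename('day_hour')
dfout = df.groupby(day_hour)['price'].agg(['max', 'min'])
print(dfout)
```

Here the result columns are named `max` and `min` rather than `price_max` and `price_min`, and no intermediate column or join is needed.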

