How to calculate moving average in Python 3?

Let's say I have a list:

y = ['1', '2', '3', '4','5','6','7','8','9','10']

I want to create a function that calculates the moving n-day average. So if n were 5, I would want my code to take the first five values (1-5), add them up and find the average, which would be 3.0, then go on to 2-6 and calculate the average, which would be 4.0, then 3-7, 4-8, 5-9, 6-10.

I don't want to calculate anything for the first n-1 days, so starting from the nth day, it'll count the previous days.

def moving_average(x:'list of prices', n):
    for num in range(len(x)+1):
        print(x[num-n:num])

This seems to print out what I want:

[]
[]
[]
[]
[]

['1', '2', '3', '4', '5']

['2', '3', '4', '5', '6']

['3', '4', '5', '6', '7']

['4', '5', '6', '7', '8']

['5', '6', '7', '8', '9']

['6', '7', '8', '9', '10']

However, I don't know how to calculate the averages of the numbers inside those lists. Any ideas?

There is a great sliding window generator in an old version of the Python docs with itertools examples:

from itertools import islice

def window(seq, n=2):
    "Returns a sliding window (of width n) over data from the iterable"
    "   s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ...                   "
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result    
    for elem in it:
        result = result[1:] + (elem,)
        yield result

Using that, computing your moving averages is trivial:

from __future__ import division  # For Python 2

def moving_averages(values, size):
    for selection in window(values, size):
        yield sum(selection) / size

Running this against your input (mapping the strings to integers) gives:

>>> y= ['1', '2', '3', '4','5','6','7','8','9','10']
>>> for avg in moving_averages(map(int, y), 5):
...     print(avg)
... 
3.0
4.0
5.0
6.0
7.0
8.0

To yield None for the first n - 1 iterations, where the window is still 'incomplete', just expand the moving_averages function a little:

def moving_averages(values, size):
    for _ in range(size - 1):
        yield None
    for selection in window(values, size):
        yield sum(selection) / size
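
As a quick check, the padded version can be run against the list from the question. The sketch below simply repeats window() and the padded moving_averages() so that it runs on its own; converting the strings with map(int, ...) is the only extra step.

```python
from itertools import islice

def window(seq, n=2):
    """Yield sliding windows (tuples of width n) over the iterable seq."""
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        result = result[1:] + (elem,)
        yield result

def moving_averages(values, size):
    # Pad the first size - 1 positions, where no full window exists yet.
    for _ in range(size - 1):
        yield None
    for selection in window(values, size):
        yield sum(selection) / size

y = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
padded = list(moving_averages(map(int, y), 5))
print(padded)
# [None, None, None, None, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```

Note that the padding is emitted unconditionally, so for an input with fewer than size items the output would be longer than the input.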

While I like Martijn's answer on this, like george, I was wondering whether it would be faster to use a running summation instead of applying sum() over and over again to mostly the same numbers.

The idea of yielding None as a default during the ramp-up phase is also interesting. In fact, one can conceive of plenty of different scenarios for moving averages. Let's split the calculation of averages into three phases:

  1. Ramp Up: the starting iterations, where the current iteration count < window size
  2. Steady Progress: exactly window_size elements are available, so a normal average can be computed: average := sum(x[iteration_counter-window_size:iteration_counter])/window_size
  3. Ramp Down: at the end of the input data, we could return another window_size - 1 "average" numbers.
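
The running-sum idea behind the steady phase can be sketched on its own, leaving out ramp-up and ramp-down entirely. This is only a minimal illustration, not the full function; the name running_sum_averages is made up for this example.

```python
from collections import deque

def running_sum_averages(data, size):
    """Yield the average of each full window of `size` items.

    Hypothetical helper: instead of calling sum() on every window, it
    adds the incoming value to a running total and subtracts the value
    that falls out of the window.
    """
    window = deque()
    running_sum = 0.0
    for val in data:
        window.append(val)
        running_sum += val
        if len(window) > size:
            # The oldest value leaves the window; remove it from the total.
            running_sum -= window.popleft()
        if len(window) == size:
            yield running_sum / size

print(list(running_sum_averages(range(1, 11), 5)))
# [3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```

One trade-off to keep in mind: with floating-point inputs, repeatedly adding and subtracting can accumulate rounding error, which recomputing sum() per window avoids.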

Here's a function that accepts:

  • Arbitrary iterables (generators are fine) as input for data
  • Arbitrary window sizes >= 1
  • Parameters to switch the production of values on or off during the Ramp Up/Down phases
  • Callback functions for those phases to control how values are produced. These can be used to constantly provide a default (e.g. None) or to provide partial averages

Here's the code:

from collections import deque 

def moving_averages(data, size, rampUp=True, rampDown=True):
    """Slide a window of <size> elements over <data> to calc an average

    First and last <size-1> iterations when window is not yet completely
    filled with data, or the window empties due to exhausted <data>, the
    average is computed with just the available data (but still divided
    by <size>).
    Set rampUp/rampDown to False in order to not provide any values during
    those start and end <size-1> iterations.
    Set rampUp/rampDown to functions to provide arbitrary partial average
    numbers during those phases. The callback will get the currently
    available input data in a deque. Do not modify that data.
    """
    d = deque()
    running_sum = 0.0

    data = iter(data)
    # rampUp
    for count in range(1, size):
        try:
            val = next(data)
        except StopIteration:
            break
        running_sum += val
        d.append(val)
        #print("up: running sum:" + str(running_sum) + "  count: " + str(count) + "  deque: " + str(d))
        if rampUp:
            if callable(rampUp):
                yield rampUp(d)
            else:
                yield running_sum / size

    # steady
    exhausted_early = True
    for val in data:
        exhausted_early = False
        running_sum += val
        #print("st: running sum:" + str(running_sum) + "  deque: " + str(d))
        yield running_sum / size
        d.append(val)
        running_sum -= d.popleft()

    # rampDown
    if rampDown:
        if exhausted_early and d:  # nothing to drop when the input was empty
            running_sum -= d.popleft()
        for (count) in range(min(len(d), size-1), 0, -1):
            #print("dn: running sum:" + str(running_sum) + "  deque: " + str(d))
            if callable(rampDown):
                yield rampDown(d)
            else:
                yield running_sum / size
            running_sum -= d.popleft()

It seems to be a bit faster than Martijn's version, which is far more elegant, though. Here's the test code:

print("")
print("Timeit")
print("-" * 80)

from itertools import islice
def window(seq, n=2):
    "Returns a sliding window (of width n) over data from the iterable"
    "   s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ...                   "
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result    
    for elem in it:
        result = result[1:] + (elem,)
        yield result

# Martijn's version:
def moving_averages_SO(values, size):
    for selection in window(values, size):
        yield sum(selection) / size


import timeit
problems = [int(i) for i in (10, 100, 1000, 10000, 1e5, 1e6, 1e7)]
for problem_size in problems:
    print("{:12s}".format(str(problem_size)), end="")

    so = timeit.repeat("list(moving_averages_SO(range("+str(problem_size)+"), 5))", number=1*max(problems)//problem_size,
                       setup="from __main__ import moving_averages_SO")
    print("{:12.3f} ".format(min(so)), end="")

    my = timeit.repeat("list(moving_averages(range("+str(problem_size)+"), 5, False, False))", number=1*max(problems)//problem_size,
                       setup="from __main__ import moving_averages")
    print("{:12.3f} ".format(min(my)), end="")

    print("")

And the output:

Timeit
--------------------------------------------------------------------------------
10                 7.242        7.656 
100                5.816        5.500 
1000               5.787        5.244 
10000              5.782        5.180 
100000             5.746        5.137 
1000000            5.745        5.198 
10000000           5.764        5.186 

The original question can now be solved with this function call:

print(list(moving_averages(range(1,11), 5,
                           rampUp=lambda _: None,
                           rampDown=False)))

The output:

[None, None, None, None, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
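
The same callback hook can return partial averages instead of None. The sketch below is a condensed restatement of the ramp-up and steady phases only, written so that it runs on its own; the name averages_with_ramp_up and the snake_case parameter are made up for this example and are not part of the function above.

```python
from collections import deque

def averages_with_ramp_up(data, size, ramp_up=True):
    """Condensed sketch: ramp-up and steady phases only (no ramp-down)."""
    d = deque()
    running_sum = 0.0
    data = iter(data)
    # Ramp-up: the first size - 1 values, before a full window exists.
    for _ in range(1, size):
        try:
            val = next(data)
        except StopIteration:
            return
        running_sum += val
        d.append(val)
        if callable(ramp_up):
            yield ramp_up(d)          # the callback decides what to emit
        elif ramp_up:
            yield running_sum / size  # default: still divided by size
    # Steady: every further value completes a full window.
    for val in data:
        running_sum += val
        yield running_sum / size
        d.append(val)
        running_sum -= d.popleft()

partial = list(averages_with_ramp_up(range(1, 11), 5,
                                     ramp_up=lambda d: sum(d) / len(d)))
print(partial)
# [1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```

Here the callback divides by the number of values seen so far rather than by the window size, so the ramp-up values are true averages of the partial window.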

Use the sum and map functions.

print(sum(map(int, x[num-n:num])))

The map function in Python 3 is basically a lazy version of this:

[int(i) for i in x[num-n:num]]

I'm sure you can guess what the sum function does.
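
Putting this together with the loop from the question, a minimal sketch might look like the following. It assumes the prices arrive as strings of whole numbers, as in the question, and starts the loop at n so that the incomplete windows are skipped.

```python
def moving_average(x, n):
    """Return the n-item moving averages of a list of numeric strings."""
    averages = []
    # Start at n so that x[num - n:num] is always a full window.
    for num in range(n, len(x) + 1):
        window = x[num - n:num]
        averages.append(sum(map(int, window)) / n)
    return averages

y = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
print(moving_average(y, 5))
# [3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```

If the prices can be non-integers, float would be the natural replacement for int in the map call.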

An approach that avoids recomputing intermediate sums:

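values = range(0, 12)
n = 5
runningsum = 0

def runs(v):
    # Add v to the module-level running total and return the new total.
    global runningsum
    runningsum += v
    return runningsum

# Prefix sums with a leading 0, so runsumlist[k] is the sum of the first k values.
runsumlist = [0] + [runs(v) for v in values]
result = [(runsumlist[k] - runsumlist[k - n]) / n for k in range(n, len(values) + 1)]

print(result)

[2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]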

To carry the numbers around as strings, use runs(int(v)) when building the sums and wrap each computed average in repr(...).

An alternative without the global:

values = [float(x) for x in range(0, 12)]
nave = 5
# Start with the average of the first window, then slide it one step at a time.
movingave = [sum(values[:nave]) / nave]
for i in range(len(values) - nave):
    movingave.append(movingave[-1] + (values[i + nave] - values[i]) / nave)
print(movingave)

Be sure to do floating-point math even if your input values are integers (in Python 2, / on two integers truncates).

[2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
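
The cumulative-sum idea can also be sketched with itertools.accumulate from the standard library, which avoids both the global and the manual update loop. The helper name prefix_sum_averages is made up for this example.

```python
from itertools import accumulate

def prefix_sum_averages(values, n):
    """Hypothetical helper: moving averages from cumulative sums."""
    # sums[k] holds the sum of the first k values; the leading 0 removes
    # the special case for the very first window.
    sums = [0] + list(accumulate(values))
    return [(sums[k] - sums[k - n]) / n for k in range(n, len(sums))]

print(prefix_sum_averages(range(0, 12), 5))
# [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
```

Each average is the difference of two cumulative sums, so the work per window is constant regardless of n.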

There is another solution that extends the itertools recipe pairwise(). You can generalize it to nwise(), which gives you the sliding window (and works even if the iterable is a generator):

import itertools as it

def nwise(iterable, n):
    ts = it.tee(iterable, n)
    for c, t in enumerate(ts):
        next(it.islice(t, c, c), None)
    return zip(*ts)

def moving_averages_nw(iterable, n):
    yield from (sum(x)/n for x in nwise(iterable, n))

>>> list(moving_averages_nw(range(1, 11), 5))
[3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
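
To see what nwise() itself produces, here is a small self-contained sketch that feeds it a generator; the squares generator is just illustrative input.

```python
import itertools as it

def nwise(iterable, n):
    ts = it.tee(iterable, n)
    for c, t in enumerate(ts):
        # Advance the c-th copy by c items so the copies are staggered.
        next(it.islice(t, c, c), None)
    return zip(*ts)

squares = (i * i for i in range(5))   # a generator, not a list
windows = list(nwise(squares, 3))
print(windows)
# [(0, 1, 4), (1, 4, 9), (4, 9, 16)]
```

Because zip() stops at the shortest of the staggered copies, only full windows are produced.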

While the setup cost is relatively high for short iterables, its impact shrinks as the data set gets longer. This uses sum(), but the code is reasonably elegant:

Timeit              MP           cfi         *****
--------------------------------------------------------------------------------
10                 4.658        4.959        7.351 
100                5.144        4.070        4.234 
1000               5.312        4.020        3.977 
10000              5.317        4.031        3.966 
100000             5.508        4.115        4.087 
1000000            5.526        4.263        4.202 
10000000           5.632        4.326        4.242 
