Let's say I have a list:
y = ['1', '2', '3', '4','5','6','7','8','9','10']
I want to create a function that calculates the moving n-day average. So if n was 5, I would want my code to take the first window (days 1-5), add it up and find the average, which would be 3.0, then go on to 2-6, whose average would be 4.0, then 3-7, 4-8, 5-9, 6-10.
I don't want to calculate anything for the first n-1 days, so starting from the nth day, it'll average over the previous n days.
def moving_average(x:'list of prices', n):
    for num in range(len(x)+1):
        print(x[num-n:num])
This seems to print out what I want:
[]
[]
[]
[]
[]
['1', '2', '3', '4', '5']
['2', '3', '4', '5', '6']
['3', '4', '5', '6', '7']
['4', '5', '6', '7', '8']
['5', '6', '7', '8', '9']
['6', '7', '8', '9', '10']
However, I don't know how to calculate the numbers inside those lists. Any ideas?
There is a great sliding window generator in an old version of the Python docs with itertools examples:
from itertools import islice

def window(seq, n=2):
    "Returns a sliding window (of width n) over data from the iterable"
    "   s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ...                   "
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        result = result[1:] + (elem,)
        yield result
Using that, computing your moving averages is trivial:
from __future__ import division  # For Python 2

def moving_averages(values, size):
    for selection in window(values, size):
        yield sum(selection) / size
Running this against your input (mapping the strings to integers) gives:
>>> y= ['1', '2', '3', '4','5','6','7','8','9','10']
>>> for avg in moving_averages(map(int, y), 5):
...     print(avg)
...
3.0
4.0
5.0
6.0
7.0
8.0
To return None for the first n - 1 iterations, where the sets are still 'incomplete', just expand the moving_averages function a little:
def moving_averages(values, size):
    for _ in range(size - 1):
        yield None
    for selection in window(values, size):
        yield sum(selection) / size
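As a quick check of the padded variant, here is a self-contained sketch that repeats the window() recipe and the extended moving_averages() from above and runs them on the question's data. The variable name `padded` is only illustrative and not part of the original answer.

```python
from itertools import islice

def window(seq, n=2):
    """Yield a sliding window (of width n) over the iterable."""
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        result = result[1:] + (elem,)
        yield result

def moving_averages(values, size):
    # Pad the first size - 1 positions, where no full window exists yet.
    for _ in range(size - 1):
        yield None
    for selection in window(values, size):
        yield sum(selection) / size

y = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
padded = list(moving_averages(map(int, y), 5))
print(padded)
# [None, None, None, None, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```

With the padding, the result has one entry per input value, which keeps it aligned with the original list.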
While I like Martijn's answer on this, like george, I was wondering whether it wouldn't be faster to use a running summation instead of applying sum() over and over again on mostly the same numbers.
Also, the idea of having None values as a default during the ramp-up phase is interesting. In fact, there may be plenty of different scenarios one could conceive for moving averages. Let's split the calculation of averages into three phases:
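1. Ramp up: the first window_size - 1 iterations, while the window is not yet completely filled with data.
2. Steady: the window is full, so a normal average can be computed: average := sum(x[iteration_counter-window_size:iteration_counter])/window_size
3. Ramp down: once the input data is exhausted, we could return another window_size - 1 "average" numbers while the window empties.
Here's a function that accepts any iterable as data, flags to switch the values produced during ramp up and ramp down on or off, and callbacks for those phases, which can be used to provide a constant default (e.g. None) or to provide partial averages. Here's the code: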
from collections import deque

def moving_averages(data, size, rampUp=True, rampDown=True):
    """Slide a window of <size> elements over <data> to calc an average

    First and last <size-1> iterations when window is not yet completely
    filled with data, or the window empties due to exhausted <data>, the
    average is computed with just the available data (but still divided
    by <size>).
    Set rampUp/rampDown to False in order to not provide any values during
    those start and end <size-1> iterations.
    Set rampUp/rampDown to functions to provide arbitrary partial average
    numbers during those phases. The callback will get the currently
    available input data in a deque. Do not modify that data.
    """
    d = deque()
    running_sum = 0.0

    data = iter(data)
    # rampUp
    for count in range(1, size):
        try:
            val = next(data)
        except StopIteration:
            break
        running_sum += val
        d.append(val)
        #print("up: running sum:" + str(running_sum) + " count: " + str(count) + " deque: " + str(d))
        if rampUp:
            if callable(rampUp):
                yield rampUp(d)
            else:
                yield running_sum / size

    # steady
    exhausted_early = True
    for val in data:
        exhausted_early = False
        running_sum += val
        #print("st: running sum:" + str(running_sum) + " deque: " + str(d))
        yield running_sum / size
        d.append(val)
        running_sum -= d.popleft()

    # rampDown
    if rampDown:
        # Guard against an empty deque (e.g. empty input data)
        if exhausted_early and d:
            running_sum -= d.popleft()
        for count in range(min(len(d), size-1), 0, -1):
            #print("dn: running sum:" + str(running_sum) + " deque: " + str(d))
            if callable(rampDown):
                yield rampDown(d)
            else:
                yield running_sum / size
            running_sum -= d.popleft()
It seems to be a bit faster than Martijn's version - which is far more elegant, though. Here's the test code:
print("")
print("Timeit")
print("-" * 80)

from itertools import islice

def window(seq, n=2):
    "Returns a sliding window (of width n) over data from the iterable"
    "   s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ...                   "
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        result = result[1:] + (elem,)
        yield result

# Martijn's version:
def moving_averages_SO(values, size):
    for selection in window(values, size):
        yield sum(selection) / size

import timeit
problems = [int(i) for i in (10, 100, 1000, 10000, 1e5, 1e6, 1e7)]
for problem_size in problems:
    print("{:12s}".format(str(problem_size)), end="")
    so = timeit.repeat("list(moving_averages_SO(range("+str(problem_size)+"), 5))", number=1*max(problems)//problem_size,
                       setup="from __main__ import moving_averages_SO")
    print("{:12.3f} ".format(min(so)), end="")
    my = timeit.repeat("list(moving_averages(range("+str(problem_size)+"), 5, False, False))", number=1*max(problems)//problem_size,
                       setup="from __main__ import moving_averages")
    print("{:12.3f} ".format(min(my)), end="")
    print("")
And the output:
Timeit
--------------------------------------------------------------------------------
10 7.242 7.656
100 5.816 5.500
1000 5.787 5.244
10000 5.782 5.180
100000 5.746 5.137
1000000 5.745 5.198
10000000 5.764 5.186
The original question can now be solved with this function call:
print(list(moving_averages(range(1,11), 5,
                           rampUp=lambda _: None,
                           rampDown=False)))
The output:
[None, None, None, None, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
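The running-sum idea at the core of the function above can be isolated in a much smaller sketch. The version below is an illustration only: the name running_sum_averages is made up here, and it deliberately drops the ramp-up and ramp-down phases, yielding a value only once the window is full.

```python
from collections import deque

def running_sum_averages(data, size):
    """Yield the average of each full window of `size` items (no ramp phases)."""
    window = deque()
    total = 0.0
    for val in data:
        window.append(val)
        total += val
        if len(window) > size:
            # Drop the oldest value instead of re-summing the whole window.
            total -= window.popleft()
        if len(window) == size:
            yield total / size

print(list(running_sum_averages(range(1, 11), 5)))
# [3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```

Each step costs one addition and one subtraction regardless of the window size. One caveat: with non-integer floats, a long-running sum can accumulate rounding error that re-summing each window would not.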
Use the sum and map functions.
print(sum(map(int, x[num-n:num])))
The map function in Python 3 is basically a lazy version of this:
[int(i) for i in x[num-n:num]]
I'm sure you can guess what the sum function does.
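Combined with the loop from the question, that could look like the sketch below. Two things here are assumptions rather than part of the answer: the loop starts at n so that the empty leading slices are skipped, and the averages are collected into a list and returned instead of printed.

```python
def moving_average(x, n):
    """Return the n-item moving averages of a list of numeric strings."""
    averages = []
    # Start at n so that only full windows are averaged.
    for num in range(n, len(x) + 1):
        averages.append(sum(map(int, x[num - n:num])) / n)
    return averages

y = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
print(moving_average(y, 5))
# [3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```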
An approach that avoids recomputing intermediate sums:
values = list(range(0, 12))
def runs(v):
    global runningsum
    runningsum += v
    return runningsum
runningsum = 0
# Prefix sums with a leading 0, so runsumlist[k] is the sum of the first k values
runsumlist = [0] + [runs(v) for v in values]
result = [(runsumlist[k] - runsumlist[k-5]) / 5 for k in range(5, len(values)+1)]
print(result)
[2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
If you want to carry the numbers around as strings, make that runs(int(v)) and then repr((runsumlist[k] - runsumlist[k-5]) / 5).
Alternative without the global:
values = [float(x) for x in range(0, 12)]
nave = 5
# Start with the average of the first window, then update it incrementally
movingave = [sum(values[:nave]) / nave]
for i in range(len(values) - nave):
    movingave.append(movingave[-1] + (values[i+nave] - values[i]) / nave)
print(movingave)
[2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
Be sure to do floating-point math even if your input values are integers.
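The running-sum list can also be built without a global or a manual loop. The sketch below swaps in itertools.accumulate from the standard library for that step; the function name prefix_sum_averages is made up for illustration, and the input is assumed to be a sequence of numbers (it needs len()).

```python
from itertools import accumulate

def prefix_sum_averages(values, n):
    # cumsum[k] holds the sum of the first k values (cumsum[0] == 0).
    cumsum = [0] + list(accumulate(values))
    # Each window sum is the difference of two prefix sums.
    return [(cumsum[k] - cumsum[k - n]) / n for k in range(n, len(values) + 1)]

print(prefix_sum_averages(list(range(0, 12)), 5))
# [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
```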
There is another solution extending an itertools recipe, pairwise(). You can extend this to nwise(), which gives you the sliding window (and works if the iterable is a generator):
import itertools as it

def nwise(iterable, n):
    # n independent iterators over the data; the c-th one is advanced by c items
    ts = it.tee(iterable, n)
    for c, t in enumerate(ts):
        next(it.islice(t, c, c), None)
    return zip(*ts)

def moving_averages_nw(iterable, n):
    yield from (sum(x)/n for x in nwise(iterable, n))
>>> list(moving_averages_nw(range(1, 11), 5))
[3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
While there is a relatively high setup cost for short iterables, the impact of this cost shrinks the longer the data set is. This uses sum(), but the code is reasonably elegant:
Timeit MP cfi *****
--------------------------------------------------------------------------------
10 4.658 4.959 7.351
100 5.144 4.070 4.234
1000 5.312 4.020 3.977
10000 5.317 4.031 3.966
100000 5.508 4.115 4.087
1000000 5.526 4.263 4.202
10000000 5.632 4.326 4.242
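To illustrate the point that nwise() also works on a generator, here is a small self-contained sketch. The squares() generator is a made-up input for the example; it has no len() and cannot be sliced, so the slicing-based approaches would not accept it directly.

```python
import itertools as it

def nwise(iterable, n):
    # n independent iterators over the same data, the c-th advanced by c items.
    ts = it.tee(iterable, n)
    for c, t in enumerate(ts):
        next(it.islice(t, c, c), None)
    return zip(*ts)

def squares():
    # Hypothetical generator input: 1, 4, 9, 16, 25, 36
    for i in range(1, 7):
        yield i * i

windows = list(nwise(squares(), 3))
print(windows)
# [(1, 4, 9), (4, 9, 16), (9, 16, 25), (16, 25, 36)]
```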