How do I create a sublist that contains the last element, but uses a general formula for all other sublists of the same size?
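I have a long list; let's call it y. len(y) = 500. I'm deliberately not including y in the code.
For each item in y, I want to find the average of that item and its 5 preceding values. I run into a problem when I reach the last item in the list, because I need to use "a + 1" in one of the lines below.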
a = 0
SMAlist = []
for each_item in y:
    if a > 4 and a < ((len(y))-1): # finding my averages begin at 6th item
        b = (y[a-5:a+1]) # this line doesn't work for the last item in y
        SMAsix = round((sum(b)/6),2)
        SMAlist.append(SMAsix)
    if a > ((len(y))-2): # this line seems unnecessary. How can I avoid it?
        b = (y[-6:-1]+[y[a]]) # Should I just use negative values in general?
        SMAsix = round((sum(b)/6),2)
        SMAlist.append(SMAsix)
    a = a+1
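You can chunk the list and build averages over those chunks. The linked answer uses full chunks; I adapted it to build incremental chunks:
Sliding average via list comprehension: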
# Inspiration for a "full" chunk I adapted: https://stackoverflow.com/a/312464/7505395
def overlappingChunks(l, n):
    """Yield overlapping n-sized chunks from l."""
    for i in range(0, len(l)):
        yield l[i:i + n]
somenums = [10406.19,10995.72,11162.55,11256.7,11634.98,12174.25,13876.47,
18491.18,16908,15266.43]
# avg over sublist-lengths
slideAvg5 = [ round(sum(part)/(len(part)*1.0),2) for part in overlappingChunks(somenums,6)]
print (slideAvg5)
Output:
[11271.73, 11850.11, 13099.36, 14056.93, 14725.22, 15343.27, 16135.52,
16888.54, 16087.22, 15266.43]
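I was going to partition the list by incrementing range(len(yourlist)) before averaging the partitions, but then I found that full partitioning is already solved here: How do you split a list into evenly sized chunks? I adapted it to yield incremental chunks so that it applies to your problem.
Which partitions are used for averaging?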
explained = {(idx,tuple(part)): round(sum(part)/(len(part)*1.0),2)
             for idx,part in enumerate(overlappingChunks(somenums,6))}
import pprint
pprint.pprint(explained)
Output (reformatted):
# Input:
# [10406.19,10995.72,11162.55,11256.7,11634.98,12174.25,13876.47,18491.18,16908,15266.43]
# Index   partitioned part of the input list   avg
{(0, (10406.19, 10995.72, 11162.55, 11256.7, 11634.98, 12174.25)) : 11271.73,
(1, (10995.72, 11162.55, 11256.7, 11634.98, 12174.25, 13876.47)) : 11850.11,
(2, (11162.55, 11256.7, 11634.98, 12174.25, 13876.47, 18491.18)) : 13099.36,
(3, (11256.7, 11634.98, 12174.25, 13876.47, 18491.18, 16908)) : 14056.93,
(4, (11634.98, 12174.25, 13876.47, 18491.18, 16908, 15266.43)) : 14725.22,
(5, (12174.25, 13876.47, 18491.18, 16908, 15266.43)) : 15343.27,
(6, (13876.47, 18491.18, 16908, 15266.43)) : 16135.52,
(7, (18491.18, 16908, 15266.43)) : 16888.54,
(8, (16908, 15266.43)) : 16087.22,
(9, (15266.43,)) : 15266.43}
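If only full six-item windows are wanted, as in the question, plain slicing may be enough: a slice ending at `a + 1` is still valid when `a + 1` equals `len(y)`, so the last item should need no special case. A minimal sketch, where `trailing_averages` is a hypothetical helper name and not part of the answer above:

```python
def trailing_averages(values, window=6):
    """Average each item with its window-1 predecessors (full windows only)."""
    return [
        round(sum(values[i - window + 1:i + 1]) / window, 2)
        for i in range(window - 1, len(values))
    ]

somenums = [10406.19, 10995.72, 11162.55, 11256.7, 11634.98, 12174.25,
            13876.47, 18491.18, 16908, 15266.43]
result = trailing_averages(somenums)
print(result)
```

For the ten sample numbers this produces five averages, one per full window, starting at the sixth item.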
Option 1: Pandas
import pandas as pd
y = [10406.19,10995.72,11162.55,11256.7,11634.98,12174.25,13876.47,18491.18,16908,15266.43]
series = pd.Series(y)
print(series.rolling(window=6, center=True).mean().dropna().tolist())
Option 2: NumPy
import numpy as np
window=6
s=np.insert(np.cumsum(np.array(y)), 0, [0])
output = (s[window :] - s[:-window]) * (1. / window)
print(list(output))
Output
[11271.731666666667, 11850.111666666666, 13099.355, 14056.930000000002, 14725.218333333332]
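The NumPy version relies on prefix sums: if `s[k]` holds the sum of the first `k` items, the sum of the window starting at index `i` is `s[i + window] - s[i]`. The same idea in plain Python, as a small sketch (`rolling_mean_prefix` is an illustrative name, not taken from the answer):

```python
from itertools import accumulate

def rolling_mean_prefix(values, window):
    """Rolling mean over full windows, computed from prefix sums."""
    # s[k] is the sum of the first k values, so s[0] == 0.
    s = [0] + list(accumulate(values))
    return [(s[i + window] - s[i]) / window
            for i in range(len(values) - window + 1)]

print(rolling_mean_prefix([1, 2, 3, 4, 5], 2))  # [1.5, 2.5, 3.5, 4.5]
```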
Timing (depends on the data size)
# Pandas
59.5 µs ± 8 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
# Numpy
19 µs ± 4.38 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
# @PatrickArtner's solution
16.1 µs ± 2.98 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Update
Code used to check the timing (works in a Jupyter notebook)
%%timeit
import pandas as pd
y = [10406.19,10995.72,11162.55,11256.7,11634.98,12174.25,13876.47,18491.18,16908,15266.43]
series = pd.Series(y)
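Outside a notebook, the standard-library `timeit` module can give comparable measurements. A rough sketch timing a list-comprehension sliding average; the helper name `sliding_averages` and the loop counts are arbitrary choices for illustration, not the setup behind the figures above:

```python
import timeit

y = [10406.19, 10995.72, 11162.55, 11256.7, 11634.98, 12174.25,
     13876.47, 18491.18, 16908, 15266.43]

def sliding_averages(values, n=6):
    """Average over each slice values[i:i+n], including the shorter tail slices."""
    chunks = (values[i:i + n] for i in range(len(values)))
    return [round(sum(part) / len(part), 2) for part in chunks]

loops = 1000
# Seven runs of `loops` calls each; every entry is the total time for one run.
runs = timeit.repeat(lambda: sliding_averages(y), number=loops, repeat=7)
best = min(runs) / loops
print(f"{best * 1e6:.1f} µs per loop (best of 7 runs, {loops} loops each)")
```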
@Vivek Kalyanarangan's "zipper" solution comes with a caveat: for longer sequences it is vulnerable to loss of significance. To make this clear, let's use float32:
>>> y = (1000 + np.sin(np.arange(1000000))).astype(np.float32)
>>> window=6
>>>
# naive zipper solution
>>> s=np.insert(np.cumsum(np.array(y)), 0, [0])
>>> output = (s[window :] - s[:-window]) * (1. / window)
# towards the end the result is clearly wrong
>>> print(output[-10:])
[1024. 1024. 1024. 1024. 1024. 1024. 1024. 1024. 1024. 1024.]
>>>
# this can be alleviated by first taking the difference and then summing
>>> np.cumsum(np.r_[y[:window].sum(), y[window:]-y[:-window]])/window
array([1000.02936, 999.98285, 999.9521 , ..., 1000.0247 , 1000.05304,
1000.0367 ], dtype=float32)
>>>
# compare to last value calculated directly for reference
>>> np.mean(y[-6:])
1000.03217
To reduce the error even further, one could chunk y and anchor the cumulative sum every so many terms, without losing much speed.
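One possible reading of that suggestion, as a sketch only: process the output in chunks and restart the running sum at the start of each chunk from a directly computed window sum, so rounding error cannot accumulate across more than one chunk. The function name `rolling_mean_anchored` and the default chunk size of 1024 are assumptions made for illustration, not something the answer specifies.

```python
import numpy as np

def rolling_mean_anchored(y, window, chunk=1024):
    """Rolling mean over full windows, re-anchoring the running sum per chunk."""
    y = np.asarray(y)
    n_out = len(y) - window + 1
    out_dtype = np.result_type(y.dtype, np.float32)
    if n_out <= 0:
        return np.empty(0, dtype=out_dtype)
    out = np.empty(n_out, dtype=out_dtype)
    for start in range(0, n_out, chunk):
        stop = min(start + chunk, n_out)
        # Anchor: sum of the first window in this chunk, computed directly.
        anchor = y[start:start + window].sum()
        # Moving the window by one drops y[j-1] and gains y[j+window-1].
        steps = y[start + window:stop + window - 1] - y[start:stop - 1]
        out[start:stop] = np.cumsum(np.r_[anchor, steps]) / window
    return out

print(rolling_mean_anchored(np.arange(10.0), 3, chunk=4))  # means 1.0 through 8.0
```

A smaller `chunk` bounds the error more tightly at the cost of more Python-level loop iterations; the right trade-off depends on the data and dtype.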