Removing elements from a list while iterating over it

Question

This part of my code does not scale as dimension gets bigger.

I loop over my data and accumulate it in every dt time window. To do this I compare each entry against the lower and upper time bounds. When I reach the upper bound, I break out of the for loop for efficiency. The next time I run the for loop, I want to start not from the beginning but from the element where I stopped previously. How can I do that?

I tried to remove/pop elements from the list, but the indexes get messed up. I read that I cannot modify the list I am looping over, but my goal does not seem uncommon, so there has to be a solution. I don't care about the original data list later in my code; I only want to optimize the accumulation.

# Here I generate data for you to show my problem
from random import randint
import numpy as np

dimension = 200
times = [randint(0, 1000) for p in range(0, dimension)]
times.sort()
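# randint includes both endpoints, so cap at dimension - 1 to keep every value a valid accumulator index
values = [randint(0, dimension - 1) for p in range(0, dimension)]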
data = [(values[k], times[k]) for k in range(dimension)]
dt = 50.0
t = min(times)
pixels = []
timestamps = []

# this is my problem
while (t <= max(times)):
    accumulator = np.zeros(dimension)
    for idx, content in enumerate(data):
        # comparing lower bound of the 'time' window
        if content[1] >= t:
            # comparing upper bound of the 'time' window
            if (content[1] < t + dt):
                accumulator[content[0]] += 1
                # if I pop the first element from the list after accumulating, indexes are screwed when looping further
                # data.pop(0)
            else:
                # all further entries are bigger because they are sorted
                break

    pixels.append(accumulator)
    timestamps.append(t)
    t += dt

Answer 1

In a simpler form, I think you are trying to do:

In [158]: times=[0, 4, 6, 10]
In [159]: data=np.arange(12)
In [160]: cnt=[0 for _ in times]
In [161]: for i in range(len(times)-1):
     ...:     for d in data:
     ...:         if d>=times[i] and d<times[i+1]:
     ...:             cnt[i]+=1
     ...:             
In [162]: cnt
Out[162]: [4, 2, 4, 0]

And you are trying to make this data loop more efficient by breaking from the loop when d gets too large, and by starting the next loop after the items which have already been counted.

Adding the break is easy, as you've done:

In [163]: cnt=[0 for _ in times]
In [164]: for i in range(len(times)-1):
     ...:     for d in data:
     ...:         if d>=times[i]:
     ...:             if d<times[i+1]:
     ...:                 cnt[i]+=1
     ...:             else:
     ...:                 break

In [165]: cnt
Out[165]: [4, 2, 4, 0]

One way to skip the items already counted is to replace for d in data with an index loop, and keep track of where we stopped last time around:

In [166]: cnt=[0 for _ in times]
In [167]: start=0
     ...: for i in range(len(times)-1):
     ...:     for j in range(start,len(data)):
     ...:         d = data[j]
     ...:         if d>=times[i]:
     ...:             if d<times[i+1]:
     ...:                 cnt[i]+=1
     ...:             else:
     ...:                 start = j
     ...:                 break
     ...:                 
In [168]: cnt
Out[168]: [4, 2, 4, 0]
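
The same start-index idea can be applied to the (value, time) tuples from the question. What follows is only a minimal sketch under a few assumptions: the function name accumulate_windows is made up for illustration, data is assumed to be sorted by time, and a plain list stands in for the np.zeros accumulator so the example has no dependencies.

```python
def accumulate_windows(data, dt, dimension):
    """Count values per time window of width dt.

    `data` is a list of (value, time) tuples sorted by time. Each window
    resumes scanning at the index where the previous window stopped.
    """
    pixels, timestamps = [], []
    if not data:
        return pixels, timestamps
    t = data[0][1]        # earliest time (data is sorted)
    t_max = data[-1][1]   # latest time
    start = 0             # index of the first entry not yet counted
    while t <= t_max:
        accumulator = [0] * dimension
        j = start
        # everything before `start` belongs to earlier windows
        while j < len(data) and data[j][1] < t + dt:
            accumulator[data[j][0]] += 1
            j += 1
        start = j
        pixels.append(accumulator)
        timestamps.append(t)
        t += dt
    return pixels, timestamps


sample = [(0, 0), (1, 10), (1, 49), (2, 50), (0, 120)]
pixels, timestamps = accumulate_windows(sample, dt=50, dimension=3)
print(pixels)      # [[1, 2, 0], [0, 0, 1], [1, 0, 0]]
print(timestamps)  # [0, 50, 100]
```

Each entry is visited once across all windows, so the work grows with len(data) plus the number of windows instead of their product.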

A pop-based version requires that I work with a list (my data is an array), and requires inserting the value back at the break:

In [186]: datal=data.tolist()
In [187]: cnt=[0 for _ in times]
In [188]: for i in range(len(times)-1):
     ...:     while True:
     ...:         d = datal.pop(0)
     ...:         if d>=times[i]:
     ...:             if d<times[i+1]:
     ...:                 cnt[i]+=1
     ...:             else:
     ...:                 datal.insert(0,d)
     ...:                 break
     ...:             
In [189]: cnt
Out[189]: [4, 2, 4, 0]
In [190]: datal
Out[190]: [10, 11]

This isn't perfect, since I still have items on the list at the end (my times don't cover the whole data range). But it tests the idea.

Here's something closer to your attempt:
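
One caveat with the pop-based idea: list.pop(0) and list.insert(0, d) both shift every remaining element, so each call costs time proportional to the list length. A possible variant, sketched here with a hypothetical helper name count_by_popping, uses collections.deque, whose popleft() is constant time. It peeks at the front item instead of popping and re-inserting, and it stops cleanly when the queue runs empty.

```python
from collections import deque


def count_by_popping(data, times):
    """Count items per [times[i], times[i+1]) window, consuming a deque."""
    queue = deque(data)
    cnt = [0 for _ in times]
    for i in range(len(times) - 1):
        # peek at the front item; only pop it if it is below the upper edge
        while queue and queue[0] < times[i + 1]:
            d = queue.popleft()
            if d >= times[i]:
                cnt[i] += 1
    return cnt, list(queue)


cnt, leftover = count_by_popping(range(12), [0, 4, 6, 10])
print(cnt)       # [4, 2, 4, 0]
print(leftover)  # [10, 11]
```

As in the list version, items beyond the last edge are left over at the end.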

In [203]: for i in range(len(times)-1):
     ...:     for d in datal[:]:
     ...:         if d>=times[i]:
     ...:             if d<times[i+1]:
     ...:                 cnt[i]+=1
     ...:                 datal.pop(0)
     ...:             else:
     ...:                 break
     ...:

The key difference is that I iterate on a copy of datal. That way the pop affects datal, but doesn't affect the current iteration. Admittedly there's a cost to the copy, so the speed up might not be significant.

A different approach would be to loop on data, and step the time index as the t and t+dt boundaries are crossed.
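
To see why the copy matters, here is a small illustrative comparison (the function count_and_consume and its iterate_on_copy flag are invented for this sketch). When the loop runs over the very list being popped, the iterator's position keeps advancing while the items shift left, so every other item is skipped.

```python
def count_and_consume(items, lo, hi, iterate_on_copy):
    """Count items in [lo, hi), popping each counted item off the front of `items`."""
    source = items[:] if iterate_on_copy else items
    count = 0
    for d in source:
        if d >= lo:
            if d < hi:
                count += 1
                items.pop(0)
            else:
                break
    return count


same_list = list(range(6))
copied = list(range(6))
direct_count = count_and_consume(same_list, 0, 4, iterate_on_copy=False)
copy_count = count_and_consume(copied, 0, 4, iterate_on_copy=True)
print(direct_count, same_list)  # 2 [2, 3, 4, 5]  (items 1 and 3 were skipped)
print(copy_count, copied)       # 4 [4, 5]
```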

In [222]: times=[0, 4, 6, 10,100]
In [223]: cnt=[0 for _ in times]; i=0
In [224]: for d in data:
     ...:     if d>=times[i]:
     ...:         if d<times[i+1]:
     ...:             cnt[i]+=1
     ...:         else:
     ...:             i += 1
     ...:             cnt[i]+=1
     ...:             
In [225]: cnt
Out[225]: [4, 2, 4, 2, 0]
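
Note that the loop above advances i by only one step per item, so it miscounts if a window contains no data at all, and it would run past the end of times if an item reached the last edge. A slightly more defensive sketch (count_in_one_pass is a hypothetical name, and sorted data is assumed) advances with a while loop instead:

```python
def count_in_one_pass(data, times):
    """Single pass over sorted data, counting items per [times[i], times[i+1])."""
    cnt = [0 for _ in times]
    i = 0
    last = len(times) - 1
    for d in data:
        if d < times[0]:
            continue  # before the first window
        # skip over any windows that d has already passed, including empty ones
        while i < last and d >= times[i + 1]:
            i += 1
        if i == last:
            break  # d is at or beyond the final edge
        cnt[i] += 1
    return cnt


print(count_in_one_pass(range(12), [0, 4, 6, 10, 100]))    # [4, 2, 4, 2, 0]
print(count_in_one_pass([0, 1, 7, 8], [0, 2, 4, 6, 10]))   # [2, 0, 0, 2, 0]
```

The final slot of cnt stays at 0, matching the shape used in the sessions above, where the last entry of times only serves as an upper edge.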

Answer 1
0 2016-09-29 18:39:26