Increment first n list elements given a condition

I have a list, for example:

l = [10, 20, 30, 40, 50, 60]

I need to increment the first n elements of the list, given a condition. The condition is independent of the list. For example, if n = 3, the list l should become:

l = [11, 21, 31, 40, 50, 60]

I understand that I can do it with a for loop over each element of the list. But I need to perform this operation around 150 million times, so I am looking for a faster method. Any help is highly appreciated. Thanks in advance.

You can create a simple data structure on top of your list which stores the start and end of the range of each increment operation. The start would be 0 in your case, so you can just store the end.

This way you don't have to actually traverse the list to increment the elements; you only record that you performed increments on ranges, for example {0 to 2} and {0 to 3}. Furthermore, you can also collate operations, so that if multiple operations increment up to the same index, you only need to store one entry.

The worst-case complexity of this solution is O(q + g × q log q + n), where g is the number of get operations, q is the number of updates and n is the length of the list. Since there can be at most n distinct end points for the intervals, this reduces to O(q + n log n + n) = O(q + n log n). A naive solution that updates the list on every query would be O(q × l), where l (the length of a query) can be as large as n, giving O(q × n). So we can expect this solution to be better when q > log n.

Working Python example below:

import collections


class RangeStructure(object):

    def __init__(self, l):
        # maps end position k -> number of pending increments of the first k elements
        self.ranges = collections.defaultdict(int)
        self.l = l

    def incToPosition(self, k):
        self.ranges[k] += 1

    def get(self):
        # work on a copy so that repeated get() calls don't re-apply the increments
        res = list(self.l)
        sorted_keys = sorted(self.ranges)
        last = len(sorted_keys) - 1
        to_add = 0
        # walk the end points from largest to smallest, touching each element once
        while last >= 0:
            start = 0 if last < 1 else sorted_keys[last - 1]
            end = sorted_keys[last]
            to_add += self.ranges[end]
            for i in range(start, end):
                res[i] += to_add
            last -= 1
        return res

rs = RangeStructure([10, 20, 30, 40, 50, 60])
rs.incToPosition(2)
rs.incToPosition(2)
rs.incToPosition(3)
rs.incToPosition(4)
print(rs.get())

And an explanation:

  1. After the inc operations, ranges will contain (start, end, inc) tuples of the form (0, 2, 2), (0, 3, 1), (0, 4, 1); these are represented in the dict as {2: 2, 3: 1, 4: 1}, since the start is always 0 and can be omitted.

  2. During the get operation, we ensure that we operate on each list element only once: we sort the ranges in increasing order of their end point and traverse them in reverse order, updating the contained list elements and the running sum (to_add) to be added to subsequent ranges.

This prints, as expected:

[14, 24, 32, 41, 50, 60]
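
As a sanity check, the same four increments can be applied with the plain nested loop the question wants to avoid. This is only a brute-force cross-check on the example data, not part of the answer's method; the variable name `brute` is made up for illustration.

```python
# Brute-force cross-check: apply each "increment the first k elements" directly.
brute = [10, 20, 30, 40, 50, 60]
for k in (2, 2, 3, 4):          # same arguments as the incToPosition calls
    for i in range(k):          # indices 0 .. k-1
        brute[i] += 1

print(brute)
```

It produces the same list as the range-based structure, at the cost of touching up to k elements per operation.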

Here's an operation-aggregating implementation in NumPy:

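import numpy

# whatever your l is, but as a NumPy array (example values from the question)
initial_array = numpy.array([10, 20, 30, 40, 50, 60])
increments = numpy.zeros_like(initial_array)
...
# every time you want to increment the first n elements
n = 3  # example value
if n:
    increments[n-1] += 1
...
# to apply the increments: reversed cumulative sum = sum of increments[i:] for each i
initial_array += increments[::-1].cumsum()[::-1]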

This is O(ops + len(initial_array)), where ops is the number of increment operations. Unless you're only doing a small number of increments over a very small portion of the list, this should be much faster. Unlike the naive implementation, it doesn't let you retrieve element values until the increments are applied; if you need to do that, you might need a solution based on a BST or BST-like structure to track increments.
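
If element values do need to be read between increments, one structure that could fill that role is a Fenwick tree (binary indexed tree) over a difference array. Note that this is a substitution: it is not a BST and is not named in the answer above. The class and method names below (`PrefixIncrementer`, `increment_first`, `get`) are hypothetical. A minimal pure-Python sketch, with O(log n) per increment and per read:

```python
class PrefixIncrementer:
    """Hypothetical helper: prefix increments with point reads (Fenwick tree)."""

    def __init__(self, values):
        self.base = list(values)
        self.size = len(self.base)
        # 1-based Fenwick tree holding a difference array of pending increments
        self.tree = [0] * (self.size + 1)

    def _add(self, index, delta):
        # Add delta to difference-array slot `index` (0-based).
        i = index + 1
        while i <= self.size:
            self.tree[i] += delta
            i += i & -i

    def increment_first(self, n, amount=1):
        # Add `amount` to elements 0 .. n-1.
        n = min(n, self.size)
        if n <= 0:
            return
        self._add(0, amount)
        if n < self.size:
            self._add(n, -amount)   # cancel the increment from index n onward

    def get(self, index):
        # Base value plus the prefix sum of differences up to `index`.
        i = index + 1
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i
        return self.base[index] + total


pi = PrefixIncrementer([10, 20, 30, 40, 50, 60])
for n in (2, 2, 3, 4):
    pi.increment_first(n)
current = [pi.get(i) for i in range(6)]
print(current)
```

Whether this beats the aggregate-then-apply approach depends on how often values are read; if reads only happen at the end, the cumulative-sum version does less work.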

Let m be the number of queries and n the length of the list to increment. Idea for an O(n + m) algorithm:
Since you only have to increment from the start to some k-th element, you will get ranges of increments. Let an increment be a pair (up to position, increment by). Example:
(1, 2) - increment positions 0 and 1 by 2
If we are trying to calculate the value at position k, then we should add to the current value at position k all increments whose position is greater than or equal to k. How can we quickly calculate the sum of increments whose positions are greater than or equal to k? We can compute values starting from the back of the list, keeping a running sum of increments.
Proof of concept:

# list to increment
a = [1, 2, 5, 1, 6]
# (up to and including k-th index, increment by value)
queries = [(1, 2), (0, 10), (3, 11), (4, 3)]

# described algorithm implementation
increments = [0]*len(a)
for position, inc in queries:
    increments[position] += inc

got = list(a)
increments_sum = 0
# walk from the back, accumulating every increment that covers index i
for i in range(len(increments) - 1, -1, -1):
    increments_sum += increments[i]
    got[i] += increments_sum


# verify that solution is correct using slow but correct algorithm
expected = list(a)
for position, inc in queries:
    for i in range(position + 1):
        expected[i] += inc

print('Expected: ', expected)
print('Got:      ', got)

Output:

Expected:  [27, 18, 19, 15, 9]
Got:       [27, 18, 19, 15, 9]
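
The backward pass is a suffix sum, so it can also be written with `itertools.accumulate` from the standard library, which is roughly a pure-Python counterpart of a reversed cumulative sum. A sketch under the same conventions as the proof of concept (each query is an inclusive end index plus an amount); the function name `apply_prefix_increments` is made up for illustration.

```python
from itertools import accumulate


def apply_prefix_increments(values, queries):
    # Aggregate all increments by their (inclusive) end position.
    increments = [0] * len(values)
    for position, inc in queries:
        increments[position] += inc
    # Running total from the back, flipped back into list order:
    # suffix_sums[i] == sum(increments[i:])
    suffix_sums = list(accumulate(reversed(increments)))[::-1]
    return [v + s for v, s in zip(values, suffix_sums)]


result = apply_prefix_increments([1, 2, 5, 1, 6],
                                 [(1, 2), (0, 10), (3, 11), (4, 3)])
print(result)
```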

You can use a list comprehension and append the rest of the list:

[x + 1 for x in a[:n]]+a[n:]
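
For illustration, here is that expression on the question's data, next to an in-place variant using slice assignment. Both still touch n elements per operation, so this is a tidier spelling of the loop and not an asymptotic speed-up; the name `new_list` is only for the example.

```python
a = [10, 20, 30, 40, 50, 60]
n = 3

# Builds a new list; `a` itself is left unchanged.
new_list = [x + 1 for x in a[:n]] + a[n:]
print(new_list)

# In-place variant: overwrite only the first n slots of `a`.
a[:n] = [x + 1 for x in a[:n]]
print(a)
```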
