
Fastest way to reposition sublist in python

What is the fastest way to reposition a sublist within a list in Python?

Let's say we have a list L = [a,b,c,d,e,f,g,h], and I want to take [c,d,e] and put it after g in the list. How can I do this fast?

Edit: In other words, I would like to write a function that:

  1. extracts a sublist L_sub of length n from L, leaving L_temp
  2. inserts the items of L_sub at a given position i into L_temp
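
The two steps above can be sketched as a plain baseline. This is only an illustration, not a tuned solution: the name `reposition` is hypothetical, and `i` is assumed to be an index into L_temp (the list after the cut), as the wording of step 2 suggests.

```python
def reposition(L, start, n, i):
    """Return a new list with L[start:start+n] moved to index i of the remainder."""
    L_sub = L[start:start + n]            # step 1: extract the sublist
    L_temp = L[:start] + L[start + n:]    # ...leaving the remainder
    L_temp[i:i] = L_sub                   # step 2: insert at position i
    return L_temp

result = reposition(list("abcdefgh"), 2, 3, 4)
```

Here [c,d,e] is cut out and re-inserted at index 4 of the remainder, which places it after g.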

The main question, I guess, is how to insert a list into a list as fast as possible.

I assume the OP wants to do this in place.

The key to making the operation fast is to minimize the creation of lists and the shortening/lengthening of lists. This means we must strive to always do a 1:1 assignment of list indices, so no L[i:i] = L[a:b] and no L[a:b] = []. Using loops with insert and pop is even worse, because then you shorten and lengthen the list many times. Concatenating lists is also bad because you first have to create one list for each part and then create larger and larger concatenated lists, once for each +. And since you want to do this in place, with concatenation you would still have to assign the generated list to L[:] at the end.

    # items:   0 | 1   2   3 | 4   5   6   7 | 8   9
    #            a   span1   b     span2     c
    # pos:       1           4               8

    # Result:
    #          0 | 4   5   6   7 | 1   2   3 | 8   9
    #            a     span2         span1   c

Let's first make an observation. If a = start, b = end = start + length, and c is the insert position, then the operation we wish to do is to cut at the | markers and swap span1 and span2. But if b = start, c = end, and a is the insert position, then we also want to swap span1 and span2. So in our function, we just deal with two consecutive segments that must be swapped.
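
As a rough illustration of this observation, the sketch below maps a forward move and a backward move to the same cut points and performs the swap with a single same-length slice assignment. The names `to_cut_points` and `swap_adjacent` are hypothetical, and this simple version copies both spans rather than only the shorter one.

```python
def to_cut_points(start, length, pos):
    """Map (start, length, pos) to cut points a < b < c of two adjacent spans."""
    end = start + length
    if pos > end:
        return (start, end, pos)      # moving the sublist forward
    if pos < start:
        return (pos, start, end)      # moving the sublist backward
    raise ValueError("pos must lie outside the sublist")

def swap_adjacent(L, a, b, c):
    """Swap L[a:b] and L[b:c] in place; len(L) is unchanged."""
    L[a:c] = L[b:c] + L[a:b]

forward = list(range(10))
swap_adjacent(forward, *to_cut_points(1, 3, 8))    # move [1, 2, 3] to position 8

backward = list(range(10))
swap_adjacent(backward, *to_cut_points(4, 4, 1))   # move [4, 5, 6, 7] to position 1
```

Both calls resolve to the cut points (1, 4, 8) from the diagram, so both lists end up in the same order.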

We can't wholly avoid making new lists, because we need to store overlapping values while moving stuff around. We can, however, make the temporary list as short as possible by choosing which of the two spans to store in it.

def inplace_shift(L, start, length, pos):
    # Reduce both directions of the move to swapping the adjacent
    # segments L[a:b] and L[b:c].
    if pos > start + length:
        (a, b, c) = (start, start + length, pos)
    elif pos < start:
        (a, b, c) = (pos, start, start + length)
    else:
        raise ValueError("Cannot shift a subsequence to inside itself")
    if not (0 <= a < b < c <= len(L)):
        msg = "Index check 0 <= {0} < {1} < {2} <= {3} failed."
        raise ValueError(msg.format(a, b, c, len(L)))

    span1, span2 = (b - a, c - b)
    # Copy only the shorter segment; every slice assignment below has
    # the same length as its target, so the list is never resized.
    if span1 < span2:
        tmp = L[a:b]
        L[a:a + span2] = L[b:c]
        L[c - span1:c] = tmp
    else:
        tmp = L[b:c]
        L[a + span2:c] = L[a:b]
        L[a:a + span2] = tmp

Kos seems to have made an error in his timings, so I redid them with his code after correcting the arguments (calculating end from start and length). These are the results, from slowest to fastest:

Nick Craig-Wood: 100 loops, best of 3: 8.58 msec per loop 
vivek: 100 loops, best of 3: 4.36 msec per loop
PaulP.R.O. (deleted?): 1000 loops, best of 3: 838 usec per loop
unbeli: 1000 loops, best of 3: 264 usec per loop
lazyr: 10000 loops, best of 3: 44.6 usec per loop

I have not tested whether any of the other approaches yield correct results.
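
One way such a check could be done is sketched below. It is an assumption-laden harness, not something from the original answers: `reference_shift` and `check_inplace` are hypothetical names, and the candidate is assumed to take (L, start, length, pos) with pos given as an index into the original list and to modify L in place.

```python
import random

def reference_shift(L, start, length, pos):
    """Slow but obvious reference: return a new list with the sublist moved."""
    end = start + length
    if pos >= end:
        return L[:start] + L[end:pos] + L[start:end] + L[pos:]
    if pos <= start:
        return L[:pos] + L[start:end] + L[pos:start] + L[end:]
    raise ValueError("pos lies inside the moved sublist")

def check_inplace(candidate, trials=200, seed=0):
    """Return the first failing (start, length, pos, size), or None if all pass."""
    rng = random.Random(seed)
    for _ in range(trials):
        size = rng.randint(3, 30)
        # Three distinct cut points a < b < c delimit two adjacent spans.
        a, b, c = sorted(rng.sample(range(size + 1), 3))
        if rng.random() < 0.5:
            start, length, pos = a, b - a, c    # forward move
        else:
            start, length, pos = b, c - b, a    # backward move
        data = list(range(size))
        expected = reference_shift(data, start, length, pos)
        candidate(data, start, length, pos)
        if data != expected:
            return (start, length, pos, size)
    return None
```

Calling `check_inplace(some_function)` on an in-place implementation with that signature returns None when every random trial matches the reference. Functions with a different argument convention would need a small adapter first.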

I would do it with Python slices:

def subshift(L, start, end, insert_at):
    temp = L[start:end]
    L = L[:start] + L[end:]
    return L[:insert_at] + temp + L[insert_at:]

print(subshift(['a','b','c','d','e','f','g','h'], 2, 5, 4))

start and end refer to the position of the sublist to cut out (end is exclusive, in the usual Python style). insert_at refers to the position at which to insert the sublist again after it has been cut out.

I think this will be faster than any solution with iteration in it if the sublists are more than an item or two in length, as nicely optimised C code is doing the heavy lifting.
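
To test that expectation on your own machine, a comparison along these lines could be used. It is a sketch only: `move_by_slices`, `move_by_loop` and `time_both` are hypothetical names, insert_at is assumed to be non-negative, and no timing numbers are claimed here.

```python
import timeit

def move_by_slices(L, start, end, insert_at):
    """Slice-based version: the copying happens inside the interpreter's C code."""
    temp = L[start:end]
    rest = L[:start] + L[end:]
    return rest[:insert_at] + temp + rest[insert_at:]

def move_by_loop(L, start, end, insert_at):
    """Same result, built one element at a time in Python-level loops."""
    rest = []
    for i, item in enumerate(L):
        if not (start <= i < end):
            rest.append(item)
    result = []
    for i, item in enumerate(rest):
        if i == insert_at:
            for j in range(start, end):
                result.append(L[j])
        result.append(item)
    if insert_at >= len(rest):
        # Insert position is at or past the end of the remainder.
        for j in range(start, end):
            result.append(L[j])
    return result

def time_both(size=100000, start=3000, end=5000, insert_at=60000, number=20):
    """Return (slice_seconds, loop_seconds) for `number` runs of each version."""
    data = list(range(size))
    t_slices = timeit.timeit(lambda: move_by_slices(data, start, end, insert_at), number=number)
    t_loop = timeit.timeit(lambda: move_by_loop(data, start, end, insert_at), number=number)
    return t_slices, t_loop
```

Calling `time_both()` returns the two totals in seconds; the default arguments are arbitrary and can be changed to match the list sizes you care about.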

Let's check what we've got so far:

Code

def subshift(L, start, end, insert_at):
    'Nick Craig-Wood'
    temp = L[start:end]
    L = L[:start] + L[end:]
    return L[:insert_at] + temp + L[insert_at:]

# (promising but buggy, needs correction;
# see comments at unbeli's answer)
def unbeli(x, start, end, at): 
    'unbeli'
    x[at:at] = x[start:end]
    x[start:end] = []

def subshift2(L, start, length, pos):
    'PaulP.R.O.'
    temp = pos - length
    S = L[start:length+start]
    for i in range(start, temp):
        L[i] = L[i + length]
    for i in range(0,length):
        L[i + temp] = S[i]
    return L

def shift(L,start,n,i):
    'vivek'
    return L[:start]+L[start+n:i]+L[start:start+n]+L[i:]

Benchmarks:

> args = range(100000), 3000, 2000, 60000

> timeit subshift(*args)
100 loops, best of 3: 6.43 ms per loop

> timeit unbeli(*args)
1000000 loops, best of 3: 631 ns per loop

> timeit subshift2(*args)
100 loops, best of 3: 11 ms per loop

> timeit shift(*args)
100 loops, best of 3: 4.28 ms per loop

Here is an alternate in-place solution:

def movesec(l,srcIndex,n,dstIndex):
    if srcIndex+n>dstIndex: raise ValueError("overlapping indexes")
    for i in range(n):
        l.insert(dstIndex+1,l.pop(srcIndex))

    return l


print(list(range(10)))
print(movesec(list(range(10)),3,2,6))

Output:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]    # original
[0, 1, 2, 5, 6, 7, 3, 4, 8, 9]    # modified

>>> L = ['a','b','c','d','e','f','g','h']
>>> L[7:7] = L[2:5]
>>> L[2:5] = []
>>> L
['a', 'b', 'f', 'g', 'c', 'd', 'e', 'h']
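
The two statements above work when the insert position lies after the slice. If it lies before the slice, the insertion shifts the slice to the right, so the second statement would delete the wrong items. A possible correction is sketched below; the name `move_slice` is hypothetical, and `at` is assumed to be an index into the original list.

```python
def move_slice(x, start, end, at):
    """Move x[start:end] so it sits just before the item originally at index `at`."""
    if start < at < end:
        raise ValueError("cannot move a slice to inside itself")
    chunk = x[start:end]
    if at >= end:
        # Inserting after the slice does not disturb indices start..end.
        x[at:at] = chunk
        x[start:end] = []
    else:
        # Inserting before the slice would shift it, so delete first.
        x[start:end] = []
        x[at:at] = chunk

L = list("abcdefgh")
move_slice(L, 2, 5, 7)     # forward: [c, d, e] goes after g

M = list("abcdefgh")
move_slice(M, 5, 7, 1)     # backward: [f, g] goes after a
```

This still lengthens and shortens the list once per call, so it trades some speed for brevity.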

def shift(L,start,n,i):
    return L[:start]+L[start+n:i]+L[start:start+n]+L[i:]
