Determining which integers are out of sequence in a Python list

I have a list of indexes:

[24, 175, 78, 80, 659, 126, 141, 149, 29, 158, 178, 179]

I want to know how to identify (and remove) the ones that are out of sequence.

Desired outcome:

[24, 78, 80, 126, 141, 149, 158, 178, 179]

As a human, I can see that 175, 659, and 29 stand out, but I am unsure how to do this programmatically. I have tried a pairwise comparison: take each adjacent pair from the list and return the first value if sub_arr[0] < sub_arr[1].

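idx_ls = [24, 175, 78, 80, 659, 126, 141, 149, 29, 158, 178, 179]
new_ls = []

def eval_adjacent(ls):
    # return the first value of the pair only if the pair is in ascending order
    if ls[1] > ls[0]:
        return ls[0]

for n, ele in enumerate(idx_ls[:-1]):
    res = eval_adjacent(idx_ls[n:n+2])
    if res is not None:  # a plain truthiness test would also drop a legitimate 0
        new_ls.append(res)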

However, this won't work when an integer is lower than it should be (29). I have thought about iterating in both directions, but I am starting to think this is not the way to go.

I think comparing it to sorted(ls) is potentially easier, but I am not sure how to select the values that are desirable (rejecting the remainder).

Can anyone point me in the right direction?

It seems like you want the longest increasing subsequence.

Try this:

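# Adapted from https://en.wikipedia.org/wiki/Longest_increasing_subsequence
def lis(X):
    N = len(X)
    P = [0] * N        # P[i]: index of the element that precedes X[i] in its subsequence
    M = [0] * (N + 1)  # M[l]: index of the smallest tail of an increasing subsequence of length l
    M[0] = -1          # sentinel, never dereferenced

    L = 0              # length of the longest subsequence found so far
    for i in range(N):
        # binary search for the smallest length lo whose tail is >= X[i]
        lo = 1
        hi = L + 1
        while lo < hi:
            mid = lo + (hi - lo) // 2
            if X[M[mid]] >= X[i]:  # >= keeps the result strictly increasing
                hi = mid
            else:
                lo = mid + 1

        newL = lo

        P[i] = M[newL - 1]
        M[newL] = i

        if newL > L:
            L = newL

    # follow the predecessor links backwards to rebuild the subsequence
    S = [0] * L
    k = M[L]
    for j in range(L - 1, -1, -1):
        S[j] = X[k]
        k = P[k]

    return S


data = [24, 175, 78, 80, 659, 126, 141, 149, 29, 158, 178, 179]
print(lis(data))  # => [24, 78, 80, 126, 141, 149, 158, 178, 179]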
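Since the question asks to identify the out-of-sequence values as well as remove them, here is a minimal sketch of the same longest-increasing-subsequence idea built on the standard library's bisect module. The function name split_in_sequence and its (kept, removed) return shape are assumptions made for illustration; they are not part of the original answer.

```python
from bisect import bisect_left


def split_in_sequence(values):
    """Return (kept, removed); kept is one longest strictly increasing subsequence."""
    tails = []     # tails[k]: smallest tail value of an increasing subsequence of length k + 1
    tail_idx = []  # position in values of tails[k]
    prev = [-1] * len(values)  # prev[i]: position of the element before values[i]

    for i, v in enumerate(values):
        k = bisect_left(tails, v)  # first tail that is >= v
        if k == len(tails):
            tails.append(v)
            tail_idx.append(i)
        else:
            tails[k] = v
            tail_idx[k] = i
        prev[i] = tail_idx[k - 1] if k > 0 else -1

    # walk the predecessor links back from the tail of the longest subsequence
    keep = set()
    i = tail_idx[-1] if tail_idx else -1
    while i != -1:
        keep.add(i)
        i = prev[i]

    kept = [v for i, v in enumerate(values) if i in keep]
    removed = [v for i, v in enumerate(values) if i not in keep]
    return kept, removed


data = [24, 175, 78, 80, 659, 126, 141, 149, 29, 158, 178, 179]
kept, removed = split_in_sequence(data)
print(kept)     # => [24, 78, 80, 126, 141, 149, 158, 178, 179]
print(removed)  # => [175, 659, 29]
```

Tracking positions rather than values means repeated values in the input are classified individually.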

You can use dynamic programming:

def long_inc_seq(lst):
    dp = [[n] for n in lst]
    for i in range(len(lst)):
        for j in range(i):
            if lst[i] > lst[j] and len(dp[i]) < len(dp[j]) + 1:
                dp[i] = dp[j] + [lst[i]]
    return max(dp, key=len)

result = long_inc_seq([24, 175, 78, 80, 659, 126, 141, 149, 29, 158, 178, 179])
print(result)

Output:

[24, 78, 80, 126, 141, 149, 158, 178, 179]

Explanation:

# The inside of dp for the above example:
>>> dp
[[24],
 [24, 175],
 [24, 78],
 [24, 78, 80],
 [24, 78, 80, 659],
 [24, 78, 80, 126],
 [24, 78, 80, 126, 141],
 [24, 78, 80, 126, 141, 149],
 [24, 29],
 [24, 78, 80, 126, 141, 149, 158],
 [24, 78, 80, 126, 141, 149, 158, 178],
 [24, 78, 80, 126, 141, 149, 158, 178, 179]]
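As the dump shows, each dp[i] stores a full copy of the best subsequence ending at lst[i]. A possible variation, sketched here under the assumption that memory matters for long inputs, keeps only a length and a predecessor position per element and rebuilds the answer at the end. The name long_inc_seq_compact is hypothetical.

```python
def long_inc_seq_compact(lst):
    if not lst:
        return []

    length = [1] * len(lst)  # length[i]: longest increasing subsequence ending at lst[i]
    prev = [-1] * len(lst)   # prev[i]: position of the previous element, -1 if none

    for i in range(len(lst)):
        for j in range(i):
            if lst[j] < lst[i] and length[j] + 1 > length[i]:
                length[i] = length[j] + 1
                prev[i] = j

    # start from the position with the greatest length and walk backwards
    k = max(range(len(lst)), key=length.__getitem__)
    out = []
    while k != -1:
        out.append(lst[k])
        k = prev[k]
    return out[::-1]


print(long_inc_seq_compact([24, 175, 78, 80, 659, 126, 141, 149, 29, 158, 178, 179]))
# => [24, 78, 80, 126, 141, 149, 158, 178, 179]
```

The running time is still quadratic in the length of the list; only the extra storage shrinks.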

You can use a list comprehension and windowed from more_itertools:

  • the windowed function creates a sliding window over the list
  • each sliding window is examined for numbers that are out of sequence
  • these numbers are collected in the variable nums_to_reject
  • and are then removed from the original list of indices to produce the result

Code:

from more_itertools import windowed

indices = [24, 175, 78, 80, 659, 126, 141, 149, 29, 158, 178, 179]

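# item[1] is rejected when its neighbours are in ascending order but it lies
# above both of them or below both of them
nums_to_reject = [
    item[1]
    for item in windowed(indices, 3)
    if item[0] < item[2] < item[1] or item[1] < item[0] < item[2]
]
# drop the rejected values while preserving the original order
result = [n for n in indices if n not in nums_to_reject]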

print(result)

Output:

[24, 78, 80, 126, 141, 149, 158, 178, 179]
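The three-element window check above only looks at immediate neighbours, so it appears to rely on every out-of-sequence value being isolated. The sketch below re-creates the same test with zip instead of more_itertools (the helper name reject_isolated_outliers is an assumption for illustration) and shows an input with two adjacent stray values where nothing is rejected.

```python
def reject_isolated_outliers(indices):
    rejected = set()  # positions of middle elements that fail the window test
    windows = zip(indices, indices[1:], indices[2:])
    for pos, (a, b, c) in enumerate(windows, start=1):
        # neighbours a and c are ascending, but b is above both or below both
        if a < c < b or b < a < c:
            rejected.add(pos)
    return [v for pos, v in enumerate(indices) if pos not in rejected]


print(reject_isolated_outliers([24, 175, 78, 80, 659, 126, 141, 149, 29, 158, 178, 179]))
# => [24, 78, 80, 126, 141, 149, 158, 178, 179]

# 50 and 60 are adjacent, so no window flags either of them
print(reject_isolated_outliers([1, 50, 60, 2, 3, 4]))
# => [1, 50, 60, 2, 3, 4]
```

For inputs like the second one, a longest-increasing-subsequence approach would return [1, 2, 3, 4] instead.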

Here's a method that I think any Python programmer can follow (I'm assuming that no two consecutive numbers are out of sequence):

my_list = [24, 175, 78, 80, 659, 126, 141, 149, 29, 158, 178, 179]
my_list2 = []
for i, cur in enumerate(my_list):
    # the first and last items have only one neighbour, so they are always kept
    if 0 < i < len(my_list) - 1:
        prev, nxt = my_list[i - 1], my_list[i + 1]
        # skip cur when its neighbours are in order but it does not fit between them
        if prev < nxt and not prev < cur < nxt:
            continue
    my_list2.append(cur)
print(my_list2)

Output:

[24, 78, 80, 126, 141, 149, 158, 178, 179]
