
How to find an array of items in a list and find the next item in Python

I have a list such as ["scissors", "rock", "rock", "paper", "rock" ... (more than 100,000 items)] and want to find an array of items such as ["rock", "rock", "paper"] in the list, find all occurrences of that pattern, and identify the item that follows the pattern in each case.

For example,

original list = ["scissors", "rock", "rock", "paper", "rock", "scissors", "rock", "paper", "scissors"] 

the pattern I want to identify = ["rock", "paper"] (there are 2 in the list above)

the eventual next items of the patterns I'm looking for = "rock" and "scissors".

How could I code this?

This will collect the indices where the pattern starts:

original = ["scissors", "rock", "rock", "paper",
            "rock", "scissors", "rock", "paper", "scissors"]

pattern = ["rock", "paper"]

indices = []
for index, item in enumerate(original):
    if item == pattern[0] and index + 1 < len(original):
        if original[index + 1] == pattern[1]:
            indices.append(index)

# where in the original is the pattern found? starting at index...
print(indices)

# how many times was the pattern found?
print(len(indices))

# Result:
# [2, 6]
# 2
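
The question also asks for the item that follows each match. A minimal sketch of that step, assuming the start indices have already been collected as above (the names `offset` and `next_items` are illustrative, not from the original answer):

```python
original = ["scissors", "rock", "rock", "paper",
            "rock", "scissors", "rock", "paper", "scissors"]
pattern = ["rock", "paper"]
indices = [2, 6]  # start positions found by the loop above

# The item following a match that starts at i sits at i + len(pattern).
# A match that ends exactly at the end of the list has no next item.
offset = len(pattern)
next_items = [original[i + offset] for i in indices
              if i + offset < len(original)]

print(next_items)  # ['rock', 'scissors']
```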

If you want to look for several patterns and identify where each one occurred in the original list and how often:

original = ["scissors", "rock", "rock", "paper",
            "rock", "scissors", "rock", "paper", "scissors"]

patterns = [["rock", "paper"], ["scissors", "rock"]]


def look_for_patterns(ori, pat):
    indices = []
    length = len(ori)
    for p in pat:
        sublst = []
        for index, item in enumerate(ori):
            if item == p[0] and index + 1 < length:
                if ori[index + 1] == p[1]:
                    sublst.append(index)
        indices.append(sublst)
    return indices, [len(i) for i in indices]

# where in the original is the pattern found? starting at index...
print(look_for_patterns(original, patterns)[0])

# how many times was the pattern found?
print(look_for_patterns(original, patterns)[1])

# Result:    
# [[2, 6], [0, 5]]
# [2, 2]
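
When many patterns of the same length have to be checked, it may be cheaper to walk the list once and record what follows every run of that length. The sketch below is one possible way to do that with a plain dictionary; `build_followers` is a hypothetical helper, not part of the answer above, and it assumes all patterns have the same length `k`.

```python
from collections import defaultdict

def build_followers(seq, k):
    """Map every run of k consecutive items to the items that follow it."""
    followers = defaultdict(list)
    # Stop k items before the end so seq[i + k] always exists.
    for i in range(len(seq) - k):
        followers[tuple(seq[i:i + k])].append(seq[i + k])
    return followers

original = ["scissors", "rock", "rock", "paper",
            "rock", "scissors", "rock", "paper", "scissors"]

followers = build_followers(original, 2)
for pat in (["rock", "paper"], ["scissors", "rock"]):
    # .get avoids creating empty entries for patterns that never occur
    print(pat, followers.get(tuple(pat), []))

# ['rock', 'paper'] ['rock', 'scissors']
# ['scissors', 'rock'] ['rock', 'paper']
```

After the single pass, each lookup is one dictionary access, at the cost of holding one entry per distinct run in memory.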

How about this:

def indexofsublist(l, sublist):
    l1, l2 = len(l), len(sublist)
    for idx in range(l1):
        if l[idx:idx+l2] == sublist:
            return idx

original_list = ["scissors", "rock", "rock", "paper", "rock", "scissors", "rock", "paper", "scissors"] 
identify = ["rock", "paper"]

idx = indexofsublist(original_list, identify)   # 2
next_idx = idx + len(identify)                  # 4

target = ["rock", "paper"]
original = ["scissors", "rock", "rock", "paper", "rock", "scissors", "rock", "paper", "scissors"]

def combinations(iterable, length):
    return [iterable[i: i + length] for i in range(len(iterable) - length + 1)]

combinations = combinations(original, 2)
indices = [i for i, x in enumerate(combinations) if x == target]

for index in indices:
    # a match at the very end of the list has no following item
    if index + 1 < len(combinations):
        print(combinations[index+1][-1])

Output:

rock
scissors

What the code does:

  1. Uses the function combinations to build every run of 2 consecutive elements.
  2. Finds the indices of all occurrences of the target.
  3. Prints the last element of combinations[index+1], which is the item you are looking for.

Or this:
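
For a list with more than 100,000 items, the same three steps can be done lazily, without building the full list of slices first. This is only a sketch of one alternative; `following_items` is an illustrative name, and the approach assumes the pattern is not empty.

```python
from itertools import islice

def following_items(seq, pattern):
    """Yield the item that comes right after each occurrence of pattern."""
    k = len(pattern)
    pattern = tuple(pattern)
    # Each window holds k candidate items plus the one item after them,
    # so a match at the very end of seq (with no next item) never appears.
    windows = zip(*(islice(seq, start, None) for start in range(k + 1)))
    for window in windows:
        if window[:k] == pattern:
            yield window[k]

original = ["scissors", "rock", "rock", "paper", "rock",
            "scissors", "rock", "paper", "scissors"]

print(list(following_items(original, ["rock", "paper"])))
# ['rock', 'scissors']
```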

pattern = ["rock", "paper"]
lenpat = len(pattern)
original = ["scissors", "rock", "rock", "paper", "rock", "scissors", "rock", "paper", "scissors"]
index = []
# + 1 so that a match at the very end of the list is not skipped
for i in range(len(original) - lenpat + 1):
    if original[i:i + lenpat] == pattern:
        index.append(i)

print(original)
print(pattern)
print(index)  # [2, 6]

The problem with the answers so far is that they're great, but unfortunately they're O(n) for each query. You need a suffix trie to answer the queries in O(k), where k is the pattern length.

Here is a good lib for that: https://github.com/ptrus/suffix-trees (pip install suffix-trees)

However, in your case a bit more processing is left to do. Since the choices will only be 'rock', 'paper' and 'scissors' (assuming lizard and spock don't join in later :-P), normalize them by replacing them with 'r', 'p' and 's'.

Use "".join(new_arr) and you're ready to use the GitHub link above. Drop a comment if you have a problem or want more explanation.

If you have a very long list to browse, recursion might be suitable (and more fun, for what it's worth):
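
The normalization step might look like the sketch below. It does not use the suffix-trees library (its API is not shown in this thread); instead it searches the joined string with the built-in str.find, which is still a linear scan per query. The names `codes`, `encode` and `following_moves` are assumptions made for illustration.

```python
codes = {"rock": "r", "paper": "p", "scissors": "s"}
decode = {letter: name for name, letter in codes.items()}

def encode(items):
    """Join a list of moves into a string of one-letter codes."""
    return "".join(codes[item] for item in items)

def following_moves(text, pat):
    """Return the character after each (possibly overlapping) match of pat."""
    found = []
    start = text.find(pat)
    while start != -1:
        end = start + len(pat)
        if end < len(text):          # a match at the very end has no next move
            found.append(text[end])
        start = text.find(pat, start + 1)  # step by 1 to allow overlaps
    return found

original = ["scissors", "rock", "rock", "paper", "rock",
            "scissors", "rock", "paper", "scissors"]

text = encode(original)              # "srrprsrps"
pat = encode(["rock", "paper"])      # "rp"
print([decode[c] for c in following_moves(text, pat)])
# ['rock', 'scissors']
```

The same encoded string is what would be handed to a suffix-trie implementation if query time matters more than build time.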

def find_pattern_and_followings(pattern, rest, followings=None,
                                pattern_indexes=None, curr_index=0):
    followings = [] if followings is None else followings
    pattern_indexes = [] if pattern_indexes is None else pattern_indexes
    # Stop once the remaining list is too short to contain the pattern
    if len(rest) < len(pattern):
        return followings, pattern_indexes
    # Check if the first elements match the pattern
    # and move to next elements
    if rest[0:len(pattern)] == pattern:
        pattern_indexes.append(curr_index)
        if len(rest) > len(pattern):
            followings.append(rest[len(pattern)])
        rest = rest[len(pattern):]
        curr_index += len(pattern)
    else:
        rest = rest[1:]
        curr_index += 1
    return find_pattern_and_followings(pattern, rest,
                                       followings=followings,
                                       pattern_indexes=pattern_indexes,
                                       curr_index=curr_index)

original = ["scissors", "rock", "rock", "paper", "rock", "scissors", "rock", "paper", "scissors"]
print(find_pattern_and_followings(["rock", "paper"], original))

Returns:

(['rock', 'scissors'], [2, 6])

What the function does is browse the list item by item; if the first items of the list match the pattern, it stores the interesting information (the index of the match and the item that follows it). Then it removes the already-scanned elements and starts again until the list is too short to contain the pattern.
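
One caveat: CPython's default recursion limit is 1000 frames, so a function that recurses once per item is likely to raise RecursionError on a 100,000-item list unless the limit is raised. The same scan can be written as a loop. The version below is a sketch that keeps the behaviour described above (after a match it skips past the matched items, so overlapping matches are not reported); the name `find_pattern_and_followings_iter` is made up for this example.

```python
def find_pattern_and_followings_iter(pattern, items):
    """Loop-based scan: returns (following items, match start indexes)."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    followings, pattern_indexes = [], []
    k = len(pattern)
    i = 0
    while i + k <= len(items):
        if items[i:i + k] == pattern:
            pattern_indexes.append(i)
            if i + k < len(items):       # no next item after a final match
                followings.append(items[i + k])
            i += k                       # skip past the matched items
        else:
            i += 1
    return followings, pattern_indexes

original = ["scissors", "rock", "rock", "paper", "rock",
            "scissors", "rock", "paper", "scissors"]

print(find_pattern_and_followings_iter(["rock", "paper"], original))
# (['rock', 'scissors'], [2, 6])
```

Indexing into the original list instead of slicing off the scanned part also avoids copying the remainder of the list at every step.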
