
Filter list of tuples for tuples that contain certain items python

I have a list of tuples like so:

a = [('1', '2', '5', '5', 'w', 'w', 'w', 'w'),
     ('1', '3', '5', '5', 'w', 'w', 'w', 'w'),
     ('1', '3', '4', '5', 'w', 'w', 'w', 'w'),
     ('1', '4', '4', '4', 'w', 'w', 'w', 'w'),
     ('1', '5', '5', '5', 'w', 'w', 'w', 'w')]

I want to be able to filter out the tuples that contain certain items. For example, I want to find all the tuples that contain '5', '5', 'w', 'w', 'w', 'w' specifically and place them in a list.

filter_for = ['5', '5', 'w', 'w', 'w', 'w']

Expected result would be:

result =  [('1', '2', '5', '5', 'w', 'w', 'w', 'w'),
           ('1', '3', '5', '5', 'w', 'w', 'w', 'w')]

filter_for will have a varying length of 1 to 7, so chaining conditions with and is not going to be ideal.

I've tried using

[i for i in a if all(j in filtered_for for j in a)]

but that doesn't work.

EDIT: If ('1', '5', '5', '5', 'w', 'w', 'w', 'w') was also in the list, I wouldn't want that tuple to be found. I guess I didn't specify this, as all working solutions below would return this tuple as well.

If I understand your requirements correctly, this should return the expected results. Here we join the filter list and each tuple into strings, and use in to check for membership.

>>> a = [('1', '2', '5', '5', 'w', 'w', 'w', 'w'),
...      ('1', '3', '5', '5', 'w', 'w', 'w', 'w'),
...      ('1', '3', '4', '5', 'w', 'w', 'w', 'w'),
...      ('1', '4', '4', '4', 'w', 'w', 'w', 'w')]
>>> filter_for = ''.join(['5', '5', 'w', 'w', 'w', 'w'])
>>> print([tup for tup in a if filter_for in ''.join(tup)])
[('1', '2', '5', '5', 'w', 'w', 'w', 'w'), ('1', '3', '5', '5', 'w', 'w', 'w', 'w')]
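
The string-joining idea above can also be written without building strings, by comparing slices of each tuple against the filter. This is only a minimal sketch: the helper name contains_run is made up for illustration and is not part of the answer. It also shows one case where joining can blur item boundaries, which would matter only if the items were longer than one character.

```python
def contains_run(tup, pattern):
    """Return True if pattern occurs as a contiguous run inside tup.

    contains_run is a hypothetical helper name used for this sketch.
    """
    pattern = tuple(pattern)
    n = len(pattern)
    # Compare every window of length n against the pattern.
    return any(tup[i:i + n] == pattern for i in range(len(tup) - n + 1))


a = [('1', '2', '5', '5', 'w', 'w', 'w', 'w'),
     ('1', '3', '5', '5', 'w', 'w', 'w', 'w'),
     ('1', '3', '4', '5', 'w', 'w', 'w', 'w'),
     ('1', '4', '4', '4', 'w', 'w', 'w', 'w')]
filter_for = ['5', '5', 'w', 'w', 'w', 'w']

result = [tup for tup in a if contains_run(tup, filter_for)]
print(result)

# Joining can blur item boundaries: ('12', '3') and ['1', '23'] both join to '123'.
joined_false_positive = ''.join(['1', '23']) in ''.join(('12', '3'))
print(joined_false_positive)
```

With the single-character items from the question both approaches select the same tuples; the slice comparison simply avoids depending on that.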

The code below has been updated to match exact sub-lists in the list of tuples. Instead of pattern matching as in the example above, we take a very different approach here.

We start off by finding the head and tail of the filter list. We then find the indices where the head and tail occur in tup (we must reverse tup to find the tail_index, as index returns only the first element matched). Using this pair of indices, we can then slice out the sublist spanning the distance between head and tail. If this sublist matches the filter, then we know that the tuple contains exactly that run between the first head and the last tail.

def match_list(filter_list, l):
    results = []
    filter_for = tuple(filter_list)
    head = filter_for[0]
    tail = filter_for[-1]

    for tup in l:
        # Both ends of the filter must occur somewhere in the tuple.
        if head in tup and tail in tup:
            # First occurrence of head, counted from the left.
            head_index = tup.index(head)
            # Last occurrence of tail, found by searching the reversed tuple.
            index_key = tup[::-1].index(tail)
            tail_index = -index_key if index_key else None
            if tup[head_index:tail_index] == filter_for:
                results.append(tup)  # Collect tuples that satisfy the condition.
    return results

Sample output

>>> a = [('1', '2', '5', '5', 'w', 'w', 'w', 'w'),
...      ('1', '3', '5', '5', 'w', 'w', 'w', 'w'),
...      ('1', '3', '4', '5', 'w', 'w', 'w', 'w'),
...      ('1', '4', '4', '4', 'w', 'w', 'w', 'w'),
...      ('1', '5', '5', '5', 'w', 'w', 'w', 'w')]  # <- Does not match!
>>> filter_for = ['5', '5', 'w', 'w', 'w', 'w']
>>> print(match_list(filter_for, a))
[('1', '2', '5', '5', 'w', 'w', 'w', 'w'), ('1', '3', '5', '5', 'w', 'w', 'w', 'w')]
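
The reversed-index step used in match_list can be looked at in isolation. The sketch below uses a hypothetical helper, last_index, that is not part of the answer; it only illustrates how searching the reversed tuple yields the position of the last occurrence, and why the tuple with three '5' items is rejected.

```python
def last_index(seq, item):
    """Index of the last occurrence of item in seq (ValueError if absent).

    last_index is a hypothetical helper name used for this sketch.
    """
    # index() on the reversed sequence counts from the right-hand end.
    offset_from_end = seq[::-1].index(item)
    return len(seq) - 1 - offset_from_end


tup = ('1', '5', '5', '5', 'w', 'w', 'w', 'w')
filter_for = ('5', '5', 'w', 'w', 'w', 'w')

first_five = tup.index('5')      # position of the first '5'
last_w = last_index(tup, 'w')    # position of the last 'w'
span = tup[first_five:last_w + 1]

print(span)
print(span == filter_for)
```

Here the span from the first '5' to the last 'w' is one item longer than the filter, so the comparison fails and the tuple is left out.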

I'm not sure I understand what you're trying to do, but I would do it as follows:

>>> [i for i in a if "".join(filter_for) in "".join(i)]
[('1', '2', '5', '5', 'w', 'w', 'w', 'w'), ('1', '3', '5', '5', 'w', 'w', 'w', 'w')]

Did you mean this

[i for i in a if all([j in i for j in filter_for])]

instead of your line?

[i for i in a if all(j in filter_for for j in a)]
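
Note that a plain membership test only asks whether each item appears at all, so repeated items in filter_for add nothing. One possible reading of the question's edit is that the tuple should hold exactly as many of each filtered item as filter_for does; that reading is an assumption, and it ignores the order of the items. A minimal sketch under that assumption, using collections.Counter and a made-up helper name counts_match:

```python
from collections import Counter


def counts_match(tup, wanted):
    """True if every item in wanted occurs in tup exactly as often as in wanted.

    counts_match is a hypothetical helper; order is deliberately ignored.
    """
    wanted_counts = Counter(wanted)
    tup_counts = Counter(tup)
    return all(tup_counts[item] == n for item, n in wanted_counts.items())


a = [('1', '2', '5', '5', 'w', 'w', 'w', 'w'),
     ('1', '3', '5', '5', 'w', 'w', 'w', 'w'),
     ('1', '3', '4', '5', 'w', 'w', 'w', 'w'),
     ('1', '4', '4', '4', 'w', 'w', 'w', 'w'),
     ('1', '5', '5', '5', 'w', 'w', 'w', 'w')]
filter_for = ['5', '5', 'w', 'w', 'w', 'w']

# Membership only: any tuple holding at least one '5' and one 'w' passes.
membership_only = [i for i in a if all(j in i for j in filter_for)]

# Count-based: exactly two '5' items and four 'w' items are required.
by_count = [i for i in a if counts_match(i, filter_for)]

print(len(membership_only))
print(by_count)
```

If the position of the items matters as well, a slice-based comparison would be needed instead of counts.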

This code seems to work; it tests every list by dividing it into several lists of the same length as filter_for

Edit: I tried to add some excluded patterns after your edit

a = [('1', '2', '5', '5', 'w', 'w', 'w', 'w'),
     ('1', '3', '5', '5', 'w', 'w', 'w', 'w'),
     ('1', '3', '4', '5', 'w', 'w', 'w', 'w'),
     ('1', '4', '4', '4', 'w', 'w', 'w', 'w'),
     ('1', '5', '5', '5', 'w', 'w', 'w', 'w')]

filter_for = ['5', '5', 'w', 'w', 'w', 'w']
excluded = [('1', '5', '5', '5', 'w', 'w', 'w', 'w')]

# add a padding key to excluded patterns
for x in range(len(excluded)):
    value = excluded[x]
    excl = {'value': value}

    for i in range(len(value) - len(filter_for) + 1):
        if list(value[i:i+len(filter_for)]) == list(filter_for):
            excl['padding'] = (i, len(value) - i - len(filter_for))

    excluded[x] = excl


def isexcluded(lst, i):
    # check if the lst is excluded by one of the `excluded` lists
    for excl in excluded:
        start_padding, end_padding = excl['padding']

        # get start and end indexes of the slice that should line up with
        # the excluded pattern around the match found at position i
        start = max(i-start_padding, 0)
        end = min(i + len(filter_for) + end_padding, len(lst))

        if list(lst[start:end]) == list(excl['value']):
            return True

    return False


def get_lists(lists, length, excluded):
    for lst in lists:
        # get all the 'sublist', parts of the list that are of the same
        # length as filter_for
        for i in range(len(lst)-length+1):
            tests = [list(lst[i:i+length]) == list(filter_for),
                     not isexcluded(lst, i)]

            if all(tests):
                yield lst
                break  # yield each matching tuple only once

result = list(get_lists(a, len(filter_for), excluded))

print(result)  # python 2: print result
