Removing non-matching items from data sets

Question

I have two datasets, each a list of nested lists, such that each item looks like list1[i] = [a, x, y, b] and list2[j] = [c, x, y, d]. The two lists do not necessarily have the same length. I'd like to go through the lists, preserve their order, and eliminate any sub-lists that do not contain matching x values. In the end, I want two lists of identical length in which, at each index, the corresponding sub-lists have the same x value.

Right now I have somewhat messy code that assumes the set of x values in list2 is a subset of those in list1 (true at the moment) and then removes the items whose x values don't match.

    len_diff = len(list1) - len(list2)
    if len_diff > 0:
        removed = []
        for (counter, row) in enumerate(list2):
            # Drop rows from list1 until its x value lines up with list2's
            while list1[counter][1] != row[1]:
                removed.append(list1.pop(counter))
        new_len_diff = len(list1) - len(list2)
        if new_len_diff < 0:
            raise IndexError('Data sets do not completely overlap')
        else:
            # Trim the tail of list1 that extends past the end of list2
            for i in range(new_len_diff):
                removed.append(list1.pop())

So basically I'm removing items from list1 whose x values don't match until they start matching again, and then removing the end of list1 beyond the x values in list2 (raising an exception if I've cut too much out of list1).

Is there a better way to do this?

I don't necessarily need to relax the assumption that all x values in list2 are in list1 at the moment, but doing so would make this code more useful to me in the future for other data manipulations. The biggest hole in my code now is that if there is a gap in my list1 data, I'll remove my entire list.

Answer 1

You should try this:

    list1 = list2 = [x for x in list1 if x[1] in zip(*list2)[1]]

EDIT

Based on the comments below, the OP adapted this answer to get the desired result:

    list1 = [x for x in list1 if x[1] in zip(*list2)[1]]
    list2 = [x for x in list2 if x[1] in zip(*list1)[1]]
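
Note that subscripting zip(...) directly only works on Python 2, where zip returns a list; on Python 3 it returns an iterator and cannot be indexed. As a rough sketch of the same filtering idea that runs on Python 3, the x values can be collected into sets once instead of being rebuilt for every row. The function name keep_matching, its key parameter, and the sample rows are made up for illustration, and the sketch assumes the x values are hashable and compared with exact equality:

```python
def keep_matching(list1, list2, key=1):
    """Return filtered copies of both lists, keeping only the rows whose
    value at index `key` (the x value) also occurs in the other list.
    The input lists are left unmodified and row order is preserved."""
    # x values present in both data sets
    common = {row[key] for row in list1} & {row[key] for row in list2}
    filtered1 = [row for row in list1 if row[key] in common]
    filtered2 = [row for row in list2 if row[key] in common]
    return filtered1, filtered2


# Hypothetical sample data: x is at index 1 in both lists
list1 = [["a0", 1, 10, "b0"], ["a1", 2, 20, "b1"],
         ["a2", 3, 30, "b2"], ["a3", 5, 50, "b3"]]
list2 = [["c0", 2, 21, "d0"], ["c1", 3, 31, "d1"], ["c2", 4, 41, "d2"]]

new1, new2 = keep_matching(list1, list2)
```

This does not require the x values of list2 to be a subset of those in list1, and a gap in either list only drops the affected rows. The two results are guaranteed to have the same length and to line up index by index only if each x value appears at most once per list and both lists hold their shared x values in the same order.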

Answer 1: accepted, 2014-05-22 19:44:01
