How to calculate the difference between two lists of lists in Python

I have two lists of lists like the following:

A = [[1, 2, 3], [1, 2, 4], [4, 5, 6]]

and

B = [[1, 2, 3], [1, 2, 6], [4, 5, 6], [4, 3, 6]]

And I wish to calculate the difference, which is equal to the following:

A - B = [[1, 2, 4]]

In other words, I want to treat A and B as sets of lists (all of the same size, 3 in this example) and find the set difference, i.e., remove from A every list that also appears in B and return the rest.

Is there a faster way to do this than using multiple for loops?

A simple list comprehension will do the trick:

[a for a in A if a not in B]

Output:

[[1, 2, 4]]
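For reference, here is a self-contained version using the lists from the question. The function name `difference` is only an illustrative choice, not something from the original post. Each `row not in b` test scans `b` from the start, so the total work grows roughly with len(a) * len(b).

```python
A = [[1, 2, 3], [1, 2, 4], [4, 5, 6]]
B = [[1, 2, 3], [1, 2, 6], [4, 5, 6], [4, 3, 6]]

def difference(a, b):
    # Keep each row of a that does not compare equal to any row of b.
    # `row not in b` is a linear scan that compares lists element by element.
    return [row for row in a if row not in b]

result = difference(A, B)
print(result)  # [[1, 2, 4]]
```

The order of A and any duplicate rows in A are preserved in the result.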

If you convert the second list to a set first, membership tests become asymptotically faster (average constant time per lookup instead of a linear scan of B); the downside is that you have to convert the rows to tuples so that they can be stored in a set. (Consider storing the rows as tuples instead of lists in the first place.)

def list_of_lists_subtract(a, b):
    b_set = {tuple(row) for row in b}
    return [row for row in a if tuple(row) not in b_set]

Note that "asymptotically faster" only means this should be faster for large inputs; the simpler version will likely be faster for small inputs. If performance is critical, then it's up to you to benchmark the alternatives on realistic data.
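One possible way to run such a benchmark is sketched below with the standard `timeit` module. The names `subtract_scan`, `subtract_set`, and `make_rows`, along with the input sizes and the random test data, are assumptions made for illustration; substitute data that resembles your real workload before drawing conclusions.

```python
import random
import timeit

def subtract_scan(a, b):
    # List-comprehension version: linear scan of b for every row of a.
    return [row for row in a if row not in b]

def subtract_set(a, b):
    # Set-based version: hash lookups against the rows of b.
    b_set = {tuple(row) for row in b}
    return [row for row in a if tuple(row) not in b_set]

def make_rows(n, width=3, seed=0):
    # Hypothetical test data: n rows of small random integers.
    rng = random.Random(seed)
    return [[rng.randrange(10) for _ in range(width)] for _ in range(n)]

for n in (10, 500):
    a = make_rows(n, seed=1)
    b = make_rows(n, seed=2)
    t_scan = timeit.timeit(lambda: subtract_scan(a, b), number=10)
    t_set = timeit.timeit(lambda: subtract_set(a, b), number=10)
    print(f"n={n}: scan {t_scan:.4f}s, set {t_set:.4f}s")
```

The timings depend on the machine and the data, so no expected output is shown here.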

You can try this:

  • Convert the first list of lists to a set of tuples, S1.
  • Convert the second list of lists to a set of tuples, S2.
  • Use the difference method, or simply S1 - S2, to get the set of tuples that are present in S1 but not in S2.
  • Convert the result to the desired format (in this case, a list of lists).
# (Untested)

A = [[1, 2, 3], [1, 2, 4], [4, 5, 6]]
B = [[1, 2, 3], [1, 2, 6], [4, 5, 6], [4, 3, 6]]

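# Tuples are hashable, so the rows can be stored in sets.
set_A = {tuple(item) for item in A}
set_B = {tuple(item) for item in B}

# Rows present in A but not in B.
difference_set = set_A - set_B

# Convert back to a list of lists; sorted() makes the order deterministic.
difference_list = [list(item) for item in sorted(difference_set)]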

print(difference_list)
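One caveat with a pure set difference: sets hold each element once and have no order, so duplicate rows in A collapse and the original order of A is lost. The small comparison below illustrates this using made-up input lists (not the ones from the question), so treat it as a sketch of the behavior rather than part of the answer above.

```python
# Hypothetical input with a duplicated row in A.
A = [[9, 9, 9], [1, 2, 4], [9, 9, 9], [1, 2, 3]]
B = [[1, 2, 3]]

# Set difference: duplicates collapse and sorted() decides the order.
via_sets = [list(t) for t in sorted({tuple(r) for r in A} - {tuple(r) for r in B})]

# Filtering A against a set built from B: order and duplicates of A survive.
b_set = {tuple(r) for r in B}
via_filter = [r for r in A if tuple(r) not in b_set]

print(via_sets)    # [[1, 2, 4], [9, 9, 9]]
print(via_filter)  # [[9, 9, 9], [1, 2, 4], [9, 9, 9]]
```

Which result is wanted depends on whether A is meant to be a true set or an ordered collection that may repeat rows.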
