Find difference in list with identical values

I need to find the differences between two lists whose values are otherwise identical, apart from two added elements.

Example:

a = ['cool task', 'b', 'another task', 'j', 'better task', 'y']
b = ['cool task', 'b', 'a task', 'j', 'another task', 'j', 'better task', 'y']

My problem is that both 'a task' and 'another task' are followed by a 'j':

[x for x in b if x not in a]
['a task']

Because both a and b contain 'j', it is filtered out of the result.

How would I make it so that I end up with

['a task', 'j']

For a simple list, what you are asking for amounts to looking up the item that follows each new element in the list:

>>> a = ['cool task', 'b', 'another task', 'j', 'better task', 'y']
>>> b = ['cool task', 'b', 'a task', 'j', 'another task', 'j', 'better task', 'y']
>>> c = [[x, b[b.index(x) + 1]] for x in b if x not in a]
>>> c
[['a task', 'j']]
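
One caveat with the snippet above: b.index(x) always returns the position of the first occurrence of x, and b[... + 1] raises an IndexError when a new item happens to be the last element of b. A minimal sketch that walks the positions directly instead; the helper name added_with_next is hypothetical and not part of the original answer, and it assumes the items are hashable:

```python
def added_with_next(a, b):
    """Pair each item of b that is missing from a with the item that follows it."""
    known = set(a)  # assumes hashable items; gives fast membership tests
    pairs = []
    for i, item in enumerate(b):
        if item not in known:
            # Use None when the new item is the last element and has no follower.
            follower = b[i + 1] if i + 1 < len(b) else None
            pairs.append([item, follower])
    return pairs


a = ['cool task', 'b', 'another task', 'j', 'better task', 'y']
b = ['cool task', 'b', 'a task', 'j', 'another task', 'j', 'better task', 'y']
print(added_with_next(a, b))  # [['a task', 'j']]
```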

But I think what you are actually after is a dictionary or a list of tuples.

Tuples:

>>> a = [('cool task', 'b'), ('another task', 'j'), ('better task', 'y')]
>>> b = [('cool task', 'b'), ('a task', 'j'), ('another task', 'j'), ('better task', 'y')]
>>> c = [x for x in b if x not in a]
>>> c
[('a task', 'j')]

Dictionaries:

>>> a = {'cool task': 'b', 'another task': 'j', 'better task': 'y'}
>>> b = {'cool task': 'b', 'a task': 'j', 'another task': 'j', 'better task': 'y'}
>>> c = [(x, b[x]) for x in b if x not in a]
>>> c
[('a task', 'j')]
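
As a possible variation on the dictionary approach, dictionary item views support set operations when the values are hashable, so the new pairs can be computed in one step. This is only a sketch; note that, unlike the comprehension above, it also reports keys that exist in both dictionaries but map to a different value in b:

```python
a = {'cool task': 'b', 'another task': 'j', 'better task': 'y'}
b = {'cool task': 'b', 'a task': 'j', 'another task': 'j', 'better task': 'y'}

# (key, value) pairs that are in b but not in a; requires hashable values.
added_or_changed = dict(b.items() - a.items())
print(added_or_changed)  # {'a task': 'j'}
```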

You could use the difflib.SequenceMatcher() class to enumerate added, removed and changed entries:

>>> from difflib import SequenceMatcher
>>> matcher = SequenceMatcher(a=a, b=b)
>>> added = []
>>> for tag, i1, i2, j1, j2 in matcher.get_opcodes():
...     if tag == 'insert':
...         added += b[j1:j2]
...
>>> added
['a task', 'j']

The above only focuses on added entries; if you need to know about entries that were removed or altered, there are opcodes for those events too. See the SequenceMatcher.get_opcodes() method documentation.
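
A rough sketch of how the other opcodes might be handled, collecting both added and removed entries. The function name list_changes is made up for this illustration, and treating a 'replace' block as "removed from a plus added to b" is an assumption about what you want, not something the question specifies:

```python
from difflib import SequenceMatcher


def list_changes(a, b):
    """Return (added, removed) items, in order, based on SequenceMatcher opcodes."""
    added, removed = [], []
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ('insert', 'replace'):
            added += b[j1:j2]    # items present only on the b side of this block
        if tag in ('delete', 'replace'):
            removed += a[i1:i2]  # items present only on the a side of this block
    return added, removed


a = ['cool task', 'b', 'another task', 'j', 'better task', 'y']
b = ['cool task', 'b', 'a task', 'j', 'another task', 'j', 'better task', 'y']
print(list_changes(a, b))  # (['a task', 'j'], [])
```

Swapping the arguments, list_changes(b, a), reports the same two entries as removed instead.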

However, if your entries are always paired, then just produce sets of tuples from them (using pair-wise iteration); you can then apply any set operation to these:

aset = set(zip(*([iter(a)] * 2)))
bset = set(zip(*([iter(b)] * 2)))
difference = bset - aset

Demo:

>>> aset = set(zip(*([iter(a)] * 2)))
>>> bset = set(zip(*([iter(b)] * 2)))
>>> aset
{('another task', 'j'), ('cool task', 'b'), ('better task', 'y')}
>>> bset
{('a task', 'j'), ('another task', 'j'), ('cool task', 'b'), ('better task', 'y')}
>>> bset - aset
{('a task', 'j')}
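
Sets discard both the original order and any repeated pairs. If either matters, something along these lines might work instead; the helper names pairs and added_pairs are hypothetical, and the sketch assumes the lists have an even length:

```python
from itertools import chain


def pairs(seq):
    """Group a flat list into consecutive (task, label) tuples."""
    it = iter(seq)
    return list(zip(it, it))  # a trailing unpaired item is silently dropped


def added_pairs(a, b):
    """Pairs of b that do not occur in a, in the order they appear in b."""
    seen = set(pairs(a))
    return [pair for pair in pairs(b) if pair not in seen]


a = ['cool task', 'b', 'another task', 'j', 'better task', 'y']
b = ['cool task', 'b', 'a task', 'j', 'another task', 'j', 'better task', 'y']
new_pairs = added_pairs(a, b)
print(new_pairs)                             # [('a task', 'j')]
print(list(chain.from_iterable(new_pairs)))  # ['a task', 'j']
```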

This works the way you want:

#!/usr/bin/env python
# -*- coding: utf-8 -*-


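def difference(a, b):
    # Make `a` the list with fewer distinct values.
    if len(set(a)) > len(set(b)):
        a, b = b, a
    a_result = list(a)
    b_result = list(b)

    for item in a:
        # Check the working copy rather than the original `b`, so an item
        # repeated more often in `a` than in `b` does not raise ValueError.
        if item in b_result:
            a_result.remove(item)
            b_result.remove(item)

    return a_result, b_result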
    # or
    # return a_result if len(set(a_result)) > len(set(b_result)) else b_result


def main():
    a = ['cool task', 'b', 'another task', 'j', 'better task', 'y']
    b = ['cool task', 'b', 'a task', 'j', 'another task', 'j', 'better task', 'y']
    print(difference(a, b))


if __name__ == "__main__":
    main()

Depending on your purposes, you could possibly use Counter from the collections module:

>>> from collections import Counter
>>> a = Counter(['cool task', 'b', 'another task', 'j', 'better task', 'y'])
>>> b = Counter(['cool task', 'b', 'a task', 'j', 'another task', 'j', 'better task', 'y'])
>>> b-a
Counter({'j': 1, 'a task': 1})
>>> list((b-a).keys())
['j', 'a task']
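
Keep in mind that list((b-a).keys()) lists each surplus value only once, and that a Counter ignores positions entirely. If the number of extra occurrences matters, Counter.elements() repeats each value by its count. A small sketch, with a hypothetical helper name extra_items:

```python
from collections import Counter


def extra_items(a, b):
    """Items that occur more often in b than in a, repeated by the surplus count."""
    surplus = Counter(b) - Counter(a)  # subtraction drops zero and negative counts
    return sorted(surplus.elements())  # sorted only to make the output deterministic


a = ['cool task', 'b', 'another task', 'j', 'better task', 'y']
b = ['cool task', 'b', 'a task', 'j', 'another task', 'j', 'better task', 'y']
print(extra_items(a, b))          # ['a task', 'j']
print(extra_items(a, b + ['j']))  # ['a task', 'j', 'j']
```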
