Finding the diff of two lists of strings

Question

I'm trying to compute a diff (based on the longest common subsequence) between two lists of strings. I'm guessing difflib could be useful here, but difflib.ndiff annotates its output with codes such as -, + and ?. For instance:

from difflib import ndiff
t1 = 'one 1\ntwo 2\nthree 3'.splitlines()
t2 = 'one 1\ntwo 29\nthree 3'.splitlines()
d = list(ndiff(t1, t2))
print(d)

['  one 1', '- two 2', '+ two 29', '?      +\n', '  three 3']

Is tokenising the output and stripping the letter codes the right way to do this? Is that the proper, Pythonic way of diffing lists?
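
For reference, stripping the codes amounts to slicing off the first two characters of each line. The following is a minimal sketch, assuming the two-character prefixes that difflib documents for ndiff ('  ' for common lines, '- ' and '+ ' for lines unique to the first or second input, '? ' for intraline hints). The names common, removed and added are illustrative only.

```python
from difflib import ndiff

t1 = 'one 1\ntwo 2\nthree 3'.splitlines()
t2 = 'one 1\ntwo 29\nthree 3'.splitlines()

common, removed, added = [], [], []
for line in ndiff(t1, t2):
    # Every ndiff line is a two-character code followed by the original text.
    code, text = line[:2], line[2:]
    if code == '  ':
        common.append(text)      # present in both inputs
    elif code == '- ':
        removed.append(text)     # only in t1
    elif code == '+ ':
        added.append(text)       # only in t2
    # '? ' lines are hints about intraline changes, not input lines; skip them.

print(common)   # ['one 1', 'three 3']
print(removed)  # ['two 2']
print(added)    # ['two 29']
```

Slicing by position rather than splitting on whitespace keeps lines that themselves begin with spaces intact.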

Answer 1

If all you want is the elements of the first list that are not in the second, you can convert both lists to sets and take the set difference with the - operator. Note that this discards ordering and duplicates.

Example:

>>> l1 = [1,2,3,4,5]
>>> l2 = [4,5,6,7,8]
>>> print(list(set(l1) - set(l2)))
[1, 2, 3]
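
Applied to the lists of strings from the question, the same idea might look like the sketch below. It assumes that neither order nor duplicate lines matter; sorted() is used only to make the printed result deterministic, and the variable names are illustrative.

```python
t1 = ['one 1', 'two 2', 'three 3']
t2 = ['one 1', 'two 29', 'three 3']

# Set difference in each direction, plus the intersection.
only_in_t1 = sorted(set(t1) - set(t2))
only_in_t2 = sorted(set(t2) - set(t1))
in_both = sorted(set(t1) & set(t2))

print(only_in_t1)  # ['two 2']
print(only_in_t2)  # ['two 29']
print(in_both)     # ['one 1', 'three 3']
```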

Answer 2

Using a list comprehension, which preserves the order of the remaining items:

In [16]: l1 = ['a', 'b', 'c', 'd']

In [17]: l2 = ['a', 'x', 'y', 'c']

In [18]: l1_l2 = [ii for ii in l1 if ii not in l2]

In [19]: l1_l2
Out[19]: ['b', 'd']

In [20]: l2_l1 = [ii for ii in l2 if ii not in l1]

In [21]: l2_l1 
Out[21]: ['x', 'y']
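
The membership tests above ignore where an item occurs. If a position-aware diff is wanted, as the question's mention of common subsequences suggests, one option is difflib.SequenceMatcher, which reports matching and differing runs without any prefix codes to strip. This is a sketch only: SequenceMatcher aims for matches that look right to people and is not guaranteed to produce a minimal edit sequence. The names common and changes are illustrative.

```python
from difflib import SequenceMatcher

l1 = ['a', 'b', 'c', 'd']
l2 = ['a', 'x', 'y', 'c']

# autojunk=False disables the popularity heuristic used for long sequences.
matcher = SequenceMatcher(None, l1, l2, autojunk=False)

common = []   # items matched in order in both lists
changes = []  # (tag, items from l1, items from l2) for each differing run
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    if tag == 'equal':
        common.extend(l1[i1:i2])
    else:
        # tag is one of 'replace', 'delete' or 'insert'
        changes.append((tag, l1[i1:i2], l2[j1:j2]))

print(common)   # ['a', 'c']
print(changes)  # [('replace', ['b'], ['x', 'y']), ('delete', ['d'], [])]
```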

2 answers

solution1
0 2015-06-23 08:29:14

solution2
0 2015-06-23 08:38:41
