How to convert list of lists to a set in python so I can compare to other sets?
How to compare a list of lists/sets in python?
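What is the easiest way to compare the 2 lists/sets and output the differences? Are there any built-in functions that will help me compare nested lists/sets?
Inputs: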
First_list = [['Test.doc', '1a1a1a', 1111],
['Test2.doc', '2b2b2b', 2222],
['Test3.doc', '3c3c3c', 3333]
]
Secnd_list = [['Test.doc', '1a1a1a', 1111],
['Test2.doc', '2b2b2b', 2222],
['Test3.doc', '8p8p8p', 9999],
['Test4.doc', '4d4d4d', 4444]]
Expected Output:
Differences = [['Test3.doc', '3c3c3c', 3333],
['Test3.doc', '8p8p8p', 9999],
['Test4.doc', '4d4d4d', 4444]]
So you want the difference between two lists of items.
first_list = [['Test.doc', '1a1a1a', 1111],
['Test2.doc', '2b2b2b', 2222],
['Test3.doc', '3c3c3c', 3333]]
secnd_list = [['Test.doc', '1a1a1a', 1111],
['Test2.doc', '2b2b2b', 2222],
['Test3.doc', '8p8p8p', 9999],
['Test4.doc', '4d4d4d', 4444]]
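First, I'd turn each list of lists into a list of tuples. Tuples are hashable (lists are not), so the list of tuples can then be converted into a set of tuples: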
first_tuple_list = [tuple(lst) for lst in first_list]
secnd_tuple_list = [tuple(lst) for lst in secnd_list]
Then you can make the sets:
first_set = set(first_tuple_list)
secnd_set = set(secnd_tuple_list)
EDIT (suggested by sdolan): you can do the last two steps for each list in a single line:
first_set = set(map(tuple, first_list))
secnd_set = set(map(tuple, secnd_list))
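Note: map is a functional programming command that applies the function in the first argument (in this case the tuple function) to each item in the second argument (in this case a list of lists).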
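To make the hashability point concrete, here is a minimal sketch. The names `rows`, `list_error` and `as_tuples` are illustrative and not part of the answer above. It shows that a list of lists cannot be passed to set() directly, while the tuple-converted rows can:

```python
rows = [['Test.doc', '1a1a1a', 1111],
        ['Test2.doc', '2b2b2b', 2222]]

# Lists are mutable and therefore unhashable, so set(rows) raises TypeError.
try:
    set(rows)
    list_error = None
except TypeError as exc:
    list_error = str(exc)

# map(tuple, rows) calls tuple() on each inner list; tuples are hashable,
# so the result can be collected into a set.
as_tuples = set(map(tuple, rows))

print(list_error)
print(as_tuples)
```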
Then find the symmetric difference between the sets:
>>> first_set.symmetric_difference(secnd_set)
set([('Test3.doc', '3c3c3c', 3333),
('Test3.doc', '8p8p8p', 9999),
('Test4.doc', '4d4d4d', 4444)])
Note that first_set ^ secnd_set is equivalent to symmetric_difference.
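Putting the steps together, a sketch along these lines would return the result as a list of lists again. The helper name `symmetric_diff_rows` is hypothetical, and the sort is only there to give a stable order (it assumes the rows can be compared with each other, which holds for the sample data):

```python
def symmetric_diff_rows(first, second):
    """Return the rows that appear in exactly one of the two lists of lists."""
    first_set = set(map(tuple, first))
    second_set = set(map(tuple, second))
    # ^ is the operator form of symmetric_difference()
    diff = first_set ^ second_set
    # Sets are unordered, so sort for a stable result, then restore lists.
    return [list(row) for row in sorted(diff)]


first_list = [['Test.doc', '1a1a1a', 1111],
              ['Test2.doc', '2b2b2b', 2222],
              ['Test3.doc', '3c3c3c', 3333]]
secnd_list = [['Test.doc', '1a1a1a', 1111],
              ['Test2.doc', '2b2b2b', 2222],
              ['Test3.doc', '8p8p8p', 9999],
              ['Test4.doc', '4d4d4d', 4444]]

differences = symmetric_diff_rows(first_list, secnd_list)
print(differences)
```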
Also, if you don't want to use sets (e.g., you are on python 2.2), it's still quite straightforward. For example, with list comprehensions:
>>> [x for x in first_list if x not in secnd_list] + [x for x in secnd_list if x not in first_list]
[['Test3.doc', '3c3c3c', 3333],
['Test3.doc', '8p8p8p', 9999],
['Test4.doc', '4d4d4d', 4444]]
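Or with the functional filter command and lambda functions (you have to test both ways and combine the results). In Python 3, filter returns an iterator, so each call is wrapped in list() before concatenating:
>>> list(filter(lambda x: x not in secnd_list, first_list)) + list(filter(lambda x: x not in first_list, secnd_list))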
[['Test3.doc', '3c3c3c', 3333],
['Test3.doc', '8p8p8p', 9999],
['Test4.doc', '4d4d4d', 4444]]
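The list-based versions keep the original row order, but each `not in` test scans a whole list. As a possible middle ground (an assumption on my part, not something the answer proposes), sets can be used only for the membership test while the result is still built from the original lists. The names `first_seen`, `secnd_seen` and `ordered_differences` are illustrative:

```python
first_list = [['Test.doc', '1a1a1a', 1111],
              ['Test2.doc', '2b2b2b', 2222],
              ['Test3.doc', '3c3c3c', 3333]]
secnd_list = [['Test.doc', '1a1a1a', 1111],
              ['Test2.doc', '2b2b2b', 2222],
              ['Test3.doc', '8p8p8p', 9999],
              ['Test4.doc', '4d4d4d', 4444]]

# Hashable copies of the rows, used only for fast membership tests.
first_seen = set(map(tuple, first_list))
secnd_seen = set(map(tuple, secnd_list))

# Walk the original lists so the output keeps their order and row type.
only_in_first = [row for row in first_list if tuple(row) not in secnd_seen]
only_in_secnd = [row for row in secnd_list if tuple(row) not in first_seen]
ordered_differences = only_in_first + only_in_secnd

print(ordered_differences)
```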
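Not sure if there is a nice function for this, but the "manual" way of doing it isn't difficult:
differences = []
# Note: this only collects items of firstList that are missing from secondList;
# repeat the loop with the lists swapped for a two-way difference.
for item in firstList:
    if item not in secondList:
        differences.append(item)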
>>> First_list = [['Test.doc', '1a1a1a', '1111'], ['Test2.doc', '2b2b2b', '2222'], ['Test3.doc', '3c3c3c', '3333']]
>>> Secnd_list = [['Test.doc', '1a1a1a', '1111'], ['Test2.doc', '2b2b2b', '2222'], ['Test3.doc', '3c3c3c', '3333'], ['Test4.doc', '4d4d4d', '4444']]
>>> z = [tuple(y) for y in First_list]
>>> z
[('Test.doc', '1a1a1a', '1111'), ('Test2.doc', '2b2b2b', '2222'), ('Test3.doc', '3c3c3c', '3333')]
>>> x = [tuple(y) for y in Secnd_list]
>>> x
[('Test.doc', '1a1a1a', '1111'), ('Test2.doc', '2b2b2b', '2222'), ('Test3.doc', '3c3c3c', '3333'), ('Test4.doc', '4d4d4d', '4444')]
>>> set(x) - set(z)
set([('Test4.doc', '4d4d4d', '4444')])
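Note that the `-` operator is a one-way difference: it only reports rows of the left-hand set that are missing from the right-hand one. A small sketch (with made-up variable names and a shortened data set) of how the two directions relate to the symmetric difference:

```python
first = {('Test.doc', '1a1a1a', 1111), ('Test3.doc', '3c3c3c', 3333)}
second = {('Test.doc', '1a1a1a', 1111), ('Test3.doc', '8p8p8p', 9999)}

# One-way differences: order of the operands matters.
only_in_second = second - first
only_in_first = first - second

# The union of both directions is the symmetric difference.
both_ways = only_in_first | only_in_second

print(only_in_first)
print(only_in_second)
print(both_ways == first ^ second)
```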
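Old question, but here's a solution I use for returning the unique elements that are not found in both lists.
I use this to compare values returned from a database with values generated by a directory crawler package. I didn't like the other solutions I found because many of them could not dynamically handle both flat lists and nested lists.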
def differentiate(x, y):
    """
    Retrieve a unique list of elements that do not exist in both x and y.
    Capable of parsing one-dimensional (flat) and two-dimensional (lists of lists) lists.
    :param x: list #1
    :param y: list #2
    :return: list of unique values
    """
    # Validate both lists, check whether either is empty
    if len(x) == 0 and len(y) == 0:
        return []  # Nothing to compare
    if len(x) == 0:
        return y  # All y values are unique if x is empty
    if len(y) == 0:
        return x  # All x values are unique if y is empty
    # Get the input type to convert back to before return
    input_type = type(x[0])
    # Dealing with a 2D dataset (list of lists)
    if isinstance(x[0], (list, tuple)):
        # Immutable and Unique - Convert list of lists into set of tuples
        first_set = set(map(tuple, x))
        secnd_set = set(map(tuple, y))
    # Dealing with a 1D dataset (list of items)
    else:
        # Unique values only
        first_set = set(x)
        secnd_set = set(y)
    # Generate set of non-shared values (symmetric difference, so values unique
    # to either list are kept) and return list of values in original type
    return [input_type(i) for i in first_set ^ secnd_set]
I guess you'll have to convert your lists to sets:
>>> a = {('a', 'b'), ('c', 'd'), ('e', 'f')}
>>> b = {('a', 'b'), ('h', 'g')}
>>> a.symmetric_difference(b)
{('e', 'f'), ('h', 'g'), ('c', 'd')}
By using set comprehensions you can make it a one-liner. If you want:
to get a set of tuples, then:
Differences = {tuple(i) for i in First_list} ^ {tuple(i) for i in Secnd_list}
or to get a list of tuples, then:
Differences = list({tuple(i) for i in First_list} ^ {tuple(i) for i in Secnd_list})
or to get a list of lists (if you really want one), then:
Differences = [list(j) for j in {tuple(i) for i in First_list} ^ {tuple(i) for i in Secnd_list}]
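PS: I read here: https://stackoverflow.com/a/10973817/4900095 that the map() function is not a pythonic way of doing things.
http://docs.python.org/library/difflib.html is a good place to start looking.
If you apply it recursively to the deltas, you should be able to handle nested data structures. But it will take some work.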
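As a starting point for the difflib route, here is a minimal sketch that only handles the top level (no recursion into the deltas). It uses difflib.SequenceMatcher on the rows converted to tuples; the names `first_rows`, `secnd_rows` and `changes` are illustrative. Unlike the set-based approaches, this comparison is order-sensitive:

```python
import difflib

first_list = [['Test.doc', '1a1a1a', 1111],
              ['Test2.doc', '2b2b2b', 2222],
              ['Test3.doc', '3c3c3c', 3333]]
secnd_list = [['Test.doc', '1a1a1a', 1111],
              ['Test2.doc', '2b2b2b', 2222],
              ['Test3.doc', '8p8p8p', 9999],
              ['Test4.doc', '4d4d4d', 4444]]

# SequenceMatcher needs hashable sequence elements, so convert rows to tuples.
first_rows = [tuple(row) for row in first_list]
secnd_rows = [tuple(row) for row in secnd_list]

matcher = difflib.SequenceMatcher(None, first_rows, secnd_rows)

# Each opcode describes how to turn first_rows[i1:i2] into secnd_rows[j1:j2];
# keep every block that is not an exact match.
changes = []
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    if tag != 'equal':
        changes.append((tag, first_rows[i1:i2], secnd_rows[j1:j2]))

for change in changes:
    print(change)
```

Each entry in `changes` pairs the rows dropped from the first list with the rows that took their place in the second, which could then be compared field by field.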