Find matching elements in nested list
I have a nested list like this:
lst = [['one two', 'three', '10'], ['spam eggs', 'spam', '8'],
       ['two three', 'four', '5'], ['foo bar', 'foo', '7'],
       ['three four', 'five', '9']]
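The last item of each sublist is a probability. I need to find the elements where the second and third words of one element match the first and second words of another, for example: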
['one two', 'three', '10'] match ['two three', 'four', '5'] match ['three four', 'five', '9']
and build a chain like this:
one two 10 three 5 four 9 five
I understand that the first step has to be tokenization of the elements:
lst = [' '.join(x).split() for x in lst]
for i in lst:
    print(i)
so I get
['one', 'two', 'three', '10']
['spam', 'eggs', 'spam', '8']
['two', 'three', 'four', '5']
['foo', 'bar', 'foo', '7']
['three', 'four', 'five', '9']
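The next step should be some kind of iterative search over each element of the list, but I'm a bit stuck on how to implement such a search in Python. Any help would be appreciated.
I suggest using pandas as follows: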
import pandas as pd
lst = [['one two', 'three', '10'], ['spam eggs', 'spam', '8'],
       ['two three', 'four', '5'], ['foo bar', 'foo', '7'],
       ['three four', 'five', '9']]
lst = [' '.join(x).split() for x in lst]
# Create a dataframe and self-merge: words 2-3 of the left row must equal words 1-2 of the right row
df = pd.DataFrame(lst)
matchedDF = df.merge(df, how='inner', left_on=[1, 2], right_on=[0, 1], suffixes=['left', 'right'])
# remove unnecessary columns
cols = matchedDF.columns.tolist()
matchedDF = matchedDF[cols[2:]]
print(matchedDF)
I get:
   0left  1left  2left  3left  0right  1right  2right  3right
0    one    two  three     10     two   three    four       5
1    two  three   four      5   three    four    five       9
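If you would rather not pull in pandas, the same self-join can be sketched with a plain dictionary. This is only a minimal sketch: match_pairs is a name I made up for illustration, and it assumes every row has already been tokenized into four items as in the question.

```python
from collections import defaultdict

def match_pairs(rows):
    """Return (left, right) pairs where left[1:3] equals right[0:2]."""
    # Index every row by its first two words so each lookup is a dict access.
    by_head = defaultdict(list)
    for row in rows:
        by_head[tuple(row[:2])].append(row)
    # For each row, fetch the rows whose head equals this row's words 2 and 3.
    return [(left, right)
            for left in rows
            for right in by_head.get(tuple(left[1:3]), [])]

lst = [['one two', 'three', '10'], ['spam eggs', 'spam', '8'],
       ['two three', 'four', '5'], ['foo bar', 'foo', '7'],
       ['three four', 'five', '9']]
rows = [' '.join(x).split() for x in lst]

for left, right in match_pairs(rows):
    print(left, 'match', right)
```

Each pair corresponds to one row of the merged dataframe: the left half is the row whose tail matched, the right half is the row whose head matched.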
This also works:
lst = [['one two', 'three', '10'], ['spam eggs', 'spam', '8'], ['two three', 'four', '5'], ['foo bar', 'foo', '7'], ['three four', 'five', '9']]
lst = [' '.join(x).split() for x in lst]
match, first = [], True
for i in lst:
    for j in lst:
        # i continues j when i's first two words equal j's second and third
        if i[0] == j[1] and i[1] == j[2]:
            if first:
                match.append(j)
                first = False
            match.append(i)
# print the matched elements
for i in match:
    if i == match[-1]:
        print(i)
    else:
        print("{} match ".format(i), end=' ')
# print the chain
for i in match:
    if i == match[0]:
        print(i[0], i[1], i[3], end=' ')
    elif i == match[-1]:
        print(i[1], i[3], i[2])
    else:
        print(i[1], i[3], end=' ')
Output of the first for i in match loop:
['one', 'two', 'three', '10'] match ['two', 'three', 'four', '5'] match ['three', 'four', 'five', '9']
And of the second:
one two 10 three 5 four 9 five
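The nested loop relies on the rows appearing in chain order in the list. If that is not guaranteed, one option is to index the rows by their first two words and follow the links from the start row. The following is a rough sketch, not a drop-in replacement: build_chain is a hypothetical helper, and it assumes each two-word head occurs only once and that there is a single chain.

```python
def build_chain(rows):
    """Follow head/tail links from the start row and format the chain."""
    # Map each row's first two words to the row (assumes heads are unique).
    by_head = {tuple(r[:2]): r for r in rows}
    tails = {tuple(r[1:3]) for r in rows}
    # The start row has a successor but no predecessor.
    start = next((r for r in rows
                  if tuple(r[:2]) not in tails and tuple(r[1:3]) in by_head),
                 None)
    if start is None:
        return ''
    parts, row, seen = [start[0]], start, set()
    while row is not None and id(row) not in seen:
        seen.add(id(row))            # guard against cycles
        parts += [row[1], row[3]]    # second word, then the row's probability
        last = row
        row = by_head.get(tuple(row[1:3]))
    parts.append(last[2])            # close with the last row's third word
    return ' '.join(parts)

lst = [['three four', 'five', '9'], ['spam eggs', 'spam', '8'],
       ['one two', 'three', '10'], ['foo bar', 'foo', '7'],
       ['two three', 'four', '5']]   # deliberately shuffled
rows = [' '.join(x).split() for x in lst]
print(build_chain(rows))
```

Because the successor is looked up by key rather than by position, the shuffled input still produces the chain in the right order.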
You can use itertools
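# import itertools
import itertools
# the value to look for (example value, use whatever you are searching for)
item = 'three'
# flatten lst into a single chain and test membership
print(item in itertools.chain.from_iterable(lst))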
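To make it clearer what that membership test actually checks, here is a small self-contained sketch (the two-row rows list is just sample data taken from the question). Note that it only tells you whether a single word occurs anywhere in the nested list; it does not by itself find the overlapping word pairs.

```python
import itertools

rows = [['one', 'two', 'three', '10'], ['spam', 'eggs', 'spam', '8']]

# chain.from_iterable lazily yields every item of every sublist, in order.
flat = list(itertools.chain.from_iterable(rows))
print(flat)

# The iterator is consumed by the test, so build a fresh one for each check.
print('eggs' in itertools.chain.from_iterable(rows))
print('five' in itertools.chain.from_iterable(rows))
```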
Try this:
lst = [['one two', 'three', '10'], ['spam eggs', 'spam', '8'],
       ['two three', 'four', '5'], ['foo bar', 'foo', '7'],
       ['three four', 'five', '9']]
lst = [' '.join(x).split() for x in lst]
for i in lst:
    print(i)
# ---------------------------------------------------------------
# collect every distinct token into one set
st = set()
for i in [set(x) for x in lst]:
    st |= i
print(st)
print(list(st))
Output:
['one', 'two', 'three', '10']
['spam', 'eggs', 'spam', '8']
['two', 'three', 'four', '5']
['foo', 'bar', 'foo', '7']
['three', 'four', 'five', '9']
{'bar', 'spam', '9', 'one', 'five', 'three', 'two', '8', 'four', '5', 'foo', '10', '7', 'eggs'}
['bar', 'spam', '9', 'one', 'five', 'three', 'two', '8', 'four', '5', 'foo', '10', '7', 'eggs']
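A set has no defined order, so the order of the words printed above can change from run to run. If you want the distinct words in the order they first appear, something like the following might do. It is a sketch that assumes Python 3.7 or later, where dicts keep insertion order; unique_words is just an illustrative name.

```python
from itertools import chain

lst = [['one two', 'three', '10'], ['spam eggs', 'spam', '8'],
       ['two three', 'four', '5'], ['foo bar', 'foo', '7'],
       ['three four', 'five', '9']]
rows = [' '.join(x).split() for x in lst]

# dict keys are unique and keep first-seen order, unlike a set.
unique_words = list(dict.fromkeys(chain.from_iterable(rows)))
print(unique_words)
```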