Finding matching elements in a nested list

Question

I have a nested list like this:

lst = [['one two', 'three', '10'], ['spam eggs', 'spam', '8'],
       ['two three', 'four', '5'], ['foo bar', 'foo', '7'],
       ['three four', 'five', '9']]

The last element is a kind of probability. I need to find elements where the second and third words of one element match the first and second words of another, for example:

['one two', 'three', '10'] match ['two three', 'four', '5'] match  ['three four', 'five', '9']

And then build chains like:

one two 10 three 5 four 9 five

I understand that the first step must be tokenization of the elements:

lst = ([' '.join(x).split() for x in lst])
for i in lst: 
    print(i)

So I get:

['one', 'two', 'three', '10']
['spam', 'eggs', 'spam', '8']
['two', 'three', 'four', '5']
['foo', 'bar', 'foo', '7']
['three', 'four', 'five', '9']

The next step should be some kind of iterative search over each element of the list, but I am a bit stuck on how to implement such a search in Python. Any help would be appreciated.

Answer 1

I would suggest using pandas in the following way:

import pandas as pd

lst = [['one two', 'three', '10'], ['spam eggs', 'spam', '8'],
   ['two three', 'four', '5'], ['foo bar', 'foo', '7'],
   ['three four', 'five', '9']]

lst = [' '.join(x).split() for x in lst]

# Create a DataFrame and merge it with itself: columns 1 and 2 of the
# left copy must equal columns 0 and 1 of the right copy

df = pd.DataFrame(lst)
matchedDF = df.merge(df,how='inner',left_on=[1,2],right_on=[0,1],suffixes=['left','right'])

# Remove unnecessary columns
cols=matchedDF.columns.tolist()

matchedDF = matchedDF[cols[2:]]

print(matchedDF)

I get:

    0left  1left  2left 3left 0right 1right 2right 3right
0   one    two  three    10    two  three   four      5
1   two  three   four     5  three   four   five      9
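For readers who would rather not pull in pandas, the self-merge above behaves much like a hash join, and a rough pure Python sketch of the same pairing is shown here. The function name find_pairs and the variable rows are made up for this illustration and are not part of the answer; the rows are assumed to be already tokenized as in the question.

```python
from collections import defaultdict

# Tokenized rows from the question: three words followed by a probability.
rows = [['one', 'two', 'three', '10'],
        ['spam', 'eggs', 'spam', '8'],
        ['two', 'three', 'four', '5'],
        ['foo', 'bar', 'foo', '7'],
        ['three', 'four', 'five', '9']]

def find_pairs(rows):
    """Return (left, right) pairs where left[1:3] equals right[0:2]."""
    # Index every row by its first two words (the "right" side of the join).
    by_prefix = defaultdict(list)
    for row in rows:
        by_prefix[(row[0], row[1])].append(row)
    # Look up each row's second and third words in that index.
    pairs = []
    for left in rows:
        for right in by_prefix.get((left[1], left[2]), []):
            pairs.append((left, right))
    return pairs

pairs = find_pairs(rows)
for left, right in pairs:
    print(left, '->', right)
```

On the sample data this yields the same two pairs as the merged DataFrame. Like the merge, it only finds adjacent links; assembling them into one chain is a separate step.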

Answer 2

This also works:

lst = [['one two', 'three', '10'],['spam eggs', 'spam', '8'], ['two three', 'four', '5'], ['foo bar', 'foo', '7'], ['three four', 'five', '9']] 
lst = ([' '.join(x).split() for x in lst])

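# Collect the rows that link together. This relies on the rows
# already appearing in chain order in lst.
match, first = [], True
for i in lst:
    for j in lst:
        # i continues j when i's first two words are j's 2nd and 3rd words
        if i[0] == j[1] and i[1] == j[2]:
            if first:
                # the very first link also contributes its left-hand row
                match.append(j)
                first = False
            match.append(i)

# Print the matched rows separated by "match"
for i in match:
    if i == match[-1]: print(i)
    else: print("{} match ".format(i), end=' ')

# Print the chain: two words from the first row, then probability and next word
for i in match:
    if i == match[0]: print(i[0], i[1], i[3], end=' ')
    elif i == match[-1]: print(i[1], i[3], i[2])
    else: print(i[1], i[3], end=' ')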

The first for i in match loop prints:

['one', 'two', 'three', '10'] match  ['two', 'three', 'four', '5'] match  ['three', 'four', 'five', '9']

And the second:

one two 10 three 5 four 9 five
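The loop above depends on the rows appearing in chain order and on there being a single chain. One possible way to relax that, sketched here under stated assumptions, is to index rows by their first two words and follow the links from each starting row. The name build_chains is hypothetical, and the sketch assumes that no two rows share the same first two words.

```python
# Tokenized rows from the question: three words followed by a probability.
rows = [['one', 'two', 'three', '10'],
        ['spam', 'eggs', 'spam', '8'],
        ['two', 'three', 'four', '5'],
        ['foo', 'bar', 'foo', '7'],
        ['three', 'four', 'five', '9']]

def build_chains(rows):
    """Follow links between rows and format each chain as one string."""
    # Map the first two words of each row to that row.
    # Assumption: no two rows share the same first two words.
    by_prefix = {(r[0], r[1]): r for r in rows}
    # Word pairs that appear as the 2nd and 3rd words of some row.
    suffixes = {(r[1], r[2]) for r in rows}
    chains = []
    for start in rows:
        # A chain starts at a row that no other row leads into.
        if (start[0], start[1]) in suffixes:
            continue
        parts = [start[0], start[1]]
        row, seen = start, set()
        # Walk forward until there is no next row (or a row repeats).
        while row is not None and id(row) not in seen:
            seen.add(id(row))
            parts += [row[3], row[2]]  # probability, then the next word
            row = by_prefix.get((row[1], row[2]))
        # Keep only chains that link at least two rows.
        if len(seen) > 1:
            chains.append(' '.join(parts))
    return chains

chains = build_chains(rows)
print(chains)
```

Because every chain begins at a row that nothing leads into, a purely circular set of rows produces no output in this sketch; that case would need its own handling.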

Answer 3

You can use itertools:

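import itertools

# Flatten the nested list and test whether a single string occurs anywhere in it.
# Note: this is only a membership test; it does not pair rows up.
lst = [['one two', 'three', '10'], ['two three', 'four', '5']]
item = 'three'  # example value to search for
print(item in itertools.chain.from_iterable(lst))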

Answer 4

Try this one:

lst = [['one two', 'three', '10'], ['spam eggs', 'spam', '8'],
       ['two three', 'four', '5'], ['foo bar', 'foo', '7'],
       ['three four', 'five', '9']]

lst = [' '.join(x).split() for x in lst]
for i in lst: 
    print(i)

# ---------------------------------------------------------------

st = set()
for i in [set(x) for x in lst]:
    st |= i

print(st)
print(list(st))

Output (the order of the set elements may vary between runs):

['one', 'two', 'three', '10']
['spam', 'eggs', 'spam', '8']
['two', 'three', 'four', '5']
['foo', 'bar', 'foo', '7']
['three', 'four', 'five', '9']
{'bar', 'spam', '9', 'one', 'five', 'three', 'two', '8', 'four', '5', 'foo', '10', '7', 'eggs'}
['bar', 'spam', '9', 'one', 'five', 'three', 'two', '8', 'four', '5', 'foo', '10', '7', 'eggs']

4 answers

Answer 1
1 2018-05-10 08:29:06

Answer 2
1 Accepted 2018-05-10 09:39:51

Answer 3
0 2018-05-10 07:56:00

Answer 4
0 2018-05-10 07:58:03
