Matching list elements with element of tuple in list of tuples

I have a list containing strings:

lst = ['a', 'a', 'b']

where each string is, in fact, a category of a corpus, and I need a list of integers that corresponds to the index of that category.

For this purpose, I built a list of tuples holding each (unique) category and its index, for example:

catlist = [(0, 'a'), (1, 'b')]

I now need to iterate over the first list of strings and, if an element matches the second element of any tuple, append that tuple's first element to a result list, like this:

[0, 0, 1]

For now I have:

catindexes = []
for item in lst:
    for i in catlist:
        if cat == catlist[i][i]:
            catindexes.append(i)

but this clearly doesn't work and I'm failing to get to the solution. Any tips would be appreciated.

>>> lst = ['a', 'a', 'b']
>>> catlist = [(0, 'a'), (1, 'b')]
>>> catindexes = []
>>> for item in lst:
...     for i in catlist:
...       if i[1] == item:
...         catindexes.append(i[0])
...
>>> catindexes
[0, 0, 1]

During the iteration, i is a direct reference to an element of catlist, not its index. I'm not using i to extract an element from lst; the for ... in ... already takes care of that. As i is a direct reference to a tuple, I can simply extract the relevant fields for matching and appending without needing to mess with the indexing of lst.
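To make the point about the loop variable concrete, here is a small sketch. The names seen, idx and cat are mine, not from the question. It records what the loop variable actually holds and uses tuple unpacking in the loop header as a slightly tidier alternative to i[0] and i[1].

```python
lst = ['a', 'a', 'b']
catlist = [(0, 'a'), (1, 'b')]

# Each loop variable is the tuple itself, not a position in catlist.
seen = [i for i in catlist]

# Unpacking the tuple in the loop header avoids the i[0] / i[1] indexing.
catindexes = []
for item in lst:
    for idx, cat in catlist:
        if cat == item:
            catindexes.append(idx)

print(seen)        # [(0, 'a'), (1, 'b')]
print(catindexes)  # [0, 0, 1]
```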

You were close. Inside the inner loop, you should check whether the item from the outer loop is equal to tup[1] (each tup is one of the tuples in catlist, for example (0, 'a') or (1, 'b')).

If they are equal, just append the first element of tup (tup[0]) to the result list.

lst = ['a', 'a', 'b']

catlist = [(0, 'a'), (1, 'b')]

catindexes = []
for item in lst:
    for tup in catlist:
        if item == tup[1]:
            catindexes.append(tup[0])
print(catindexes)

You can also use a list comprehension:

catindexes = [tup[0] for item in lst for tup in catlist if tup[1] == item]

You can create a dictionary (call it d) from catlist with the keys and values reversed, so that each category maps to its index. Then, for each element i of lst, what you're looking for is d[i]:

d = {v: k for k, v in catlist}
res = [d[i] for i in lst]

Output:

>>> lst = ['a', 'a', 'b']
>>> catlist = [(0, 'a'), (1, 'b')]
>>> d = {v: k for k, v in catlist}
>>> d
{'a': 0, 'b': 1}
>>>
>>> res = [d[i] for i in lst]
>>> res
[0, 0, 1]
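One difference from the nested loops is worth noting: with a plain d[i] lookup, a string in lst that has no entry in catlist raises a KeyError, whereas the loops silently skip it. The sketch below assumes you would want either to skip unknown categories or to mark them with None. The extra 'z' entry is made up for illustration.

```python
lst = ['a', 'z', 'b']          # 'z' is a made-up category with no entry in catlist
catlist = [(0, 'a'), (1, 'b')]

d = {v: k for k, v in catlist}

# Skip unknown categories, mirroring what the nested loops would do.
skipped = [d[i] for i in lst if i in d]

# Or keep positions aligned with lst and mark unknowns with None.
aligned = [d.get(i) for i in lst]

print(skipped)  # [0, 1]
print(aligned)  # [0, None, 1]
```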

An efficient way for big lists:

Step 1: build the right dictionary.

d=dict((v,k) for (k,v) in catlist)

Step 2: use it.

[d[k] for k in lst]

This way the execution time grows like len(lst) + len(catlist) instead of len(lst) x len(catlist), because each dictionary lookup takes constant time on average.
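If you want to check the scaling claim on your own machine, a rough timing sketch could look like this. The list sizes and the function names nested and with_dict are arbitrary choices of mine, and the absolute numbers will vary from run to run.

```python
import timeit

# Hypothetical sizes, chosen only to make the difference visible.
catlist = [(k, 'cat%d' % k) for k in range(500)]
lst = [name for _, name in catlist] * 10

def nested():
    # Scans catlist once per item of lst: len(lst) x len(catlist) comparisons.
    return [k for item in lst for (k, v) in catlist if v == item]

def with_dict():
    # One pass to build the dictionary, then one lookup per item of lst.
    d = dict((v, k) for (k, v) in catlist)
    return [d[item] for item in lst]

# Both approaches should produce the same result.
assert nested() == with_dict()

print('nested loops:', timeit.timeit(nested, number=3))
print('dictionary:  ', timeit.timeit(with_dict, number=3))
```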

I would recommend using a dictionary for your catlist instead. I think it more naturally fits what you are trying to do:

lst = ['a', 'a', 'b']
catdict = {'a': 0, 'b': 1}
res = [catdict[k] for k in lst]  # res = [0, 0, 1]
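The question does not say how the category indexes were assigned. Assuming they simply follow the order in which each category first appears, the dictionary could be built straight from lst, as in this sketch (the name unique is mine):

```python
lst = ['a', 'a', 'b']

# dict.fromkeys keeps the first occurrence of each string, in order (Python 3.7+).
unique = list(dict.fromkeys(lst))

# Map each category to its position among the unique categories.
catdict = {cat: idx for idx, cat in enumerate(unique)}

res = [catdict[k] for k in lst]

print(catdict)  # {'a': 0, 'b': 1}
print(res)      # [0, 0, 1]
```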

The condition in the if block is not correct.

Try this:

lst = ['a', 'a', 'b']
catlist = [(0, 'a'), (1, 'b')]


catindexes = []
for item in lst:
    for i in catlist:
        if i[1]==item:
            catindexes.append(i[0])

print(catindexes)
