
Sort and filter a list based on elements from a second list

I have two lists. The first contains people's names, each followed by various characters (numbers, letters, and so on), e.g.:

listNameAge = ['alain_90xx', 'fred_10y', 'george_50', 'julia_10l','alain_10_aa', 'fred_90', 'julia_50', 'george_10s', 'alain_50', 'fred_50', 'julia_90']

The second one contains the names of the people:

listName = ['fred', 'julia', 'alain', 'george']

Using the second list, I would like to build a third list that matches the first one, such that each name in the first list is mapped to its index position in the second one, i.e.:

thirdlist = [2, 0, 3, 1, 2, 0, 1, 3, 2, 0, 1]

The name and the characters are separated by an underscore, but the characters can be of any sort. I could loop over the elements of listNameAge, separate each person's name from the rest of the characters using .split('_') on the string, and then find its index in listName using a second loop.
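For reference, the two-loop approach described above might look roughly like this. It is only a sketch, not the asker's actual code; the names `item`, `name`, `position`, and `candidate` are illustrative.

```python
listNameAge = ['alain_90xx', 'fred_10y', 'george_50', 'julia_10l', 'alain_10_aa',
               'fred_90', 'julia_50', 'george_10s', 'alain_50', 'fred_50', 'julia_90']
listName = ['fred', 'julia', 'alain', 'george']

thirdlist = []
for item in listNameAge:
    name = item.split('_')[0]  # text before the first underscore
    # Second loop: scan listName for the matching name and record its position.
    for position, candidate in enumerate(listName):
        if candidate == name:
            thirdlist.append(position)
            break

print(thirdlist)  # [2, 0, 3, 1, 2, 0, 1, 3, 2, 0, 1]
```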

I was wondering, however, whether there is a simpler way to do this, i.e. avoid the explicit loops and use only a list comprehension?

While you can do this with a one-liner, I think that, for efficiency, it would pay to build a dictionary:

>>> namePos = dict((name, i) for (i, name) in enumerate(listName))
>>> [namePos[n.split('_')[0]] for n in listNameAge]
[2, 0, 3, 1, 2, 0, 1, 3, 2, 0, 1]

The (expected) running time of this code is Θ(m + n), where m is the length of the first list and n is the length of the other one.
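To make the two terms of that bound visible, here is a minimal sketch that wraps the dictionary approach in a function. The function name `positions_by_name` and its parameter names are hypothetical, and `str.partition` is used in place of `split` purely as a matter of taste.

```python
def positions_by_name(name_age_items, names):
    # Build the name -> index table once: Θ(n) for n names.
    name_pos = {name: i for i, name in enumerate(names)}
    # Each dictionary lookup is expected O(1), so this pass is Θ(m) for m items.
    return [name_pos[item.partition('_')[0]] for item in name_age_items]


listNameAge = ['alain_90xx', 'fred_10y', 'george_50', 'julia_10l', 'alain_10_aa',
               'fred_90', 'julia_50', 'george_10s', 'alain_50', 'fred_50', 'julia_90']
listName = ['fred', 'julia', 'alain', 'george']

print(positions_by_name(listNameAge, listName))
```

Two behaviours are worth keeping in mind with this sketch: a name that is missing from the second list raises a KeyError, and if the second list contained duplicates the dictionary would keep the last position, whereas .index() returns the first.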

For this specific question, I would recommend using a loop, just for clarity. However, if you must use a list comprehension, you can do it in essentially the same way:

thirdlist = [listName.index(x[:x.find('_')]) for x in listNameAge]

thirdList = [listName.index(string.split("_")[0]) for string in listNameAge]

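This is a list comprehension built from listName.index(string.split("_")[0]), where string takes each item of listNameAge in turn. string.split("_")[0] is the part of the string from the start up to the first underscore, so listName.index(string.split("_")[0]) is the index of the first occurrence of that part in listName.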
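Broken into steps for a single item, the expression evaluates as follows. This is only an illustration; the intermediate names `parts`, `name`, and `position` are not part of the answer.

```python
listName = ['fred', 'julia', 'alain', 'george']

string = 'alain_10_aa'
parts = string.split("_")        # ['alain', '10', 'aa']
name = parts[0]                  # 'alain', everything before the first underscore
position = listName.index(name)  # 2, the first position of 'alain' in listName

print(parts, name, position)
```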

You can take each item in listNameAge, split it on '_', take the first part of the split, and then use index to find it in the second list.

>>> [listName.index(i.split('_')[0]) for i in listNameAge]
[2, 0, 3, 1, 2, 0, 1, 3, 2, 0, 1]

You can try this: for each item of listNameAge, check which name from listName appears in it:

thirdList = []
for x in listNameAge:
    for y in listName:
        # Substring test: assumes no name also occurs elsewhere inside an item.
        if y in x:
            thirdList.append(listName.index(y))

Result:

[2, 0, 3, 1, 2, 0, 1, 3, 2, 0, 1]

I would strongly recommend against using .index(), as its complexity is O(n), which makes the overall complexity of this operation O(mn), where m and n are the sizes of the lists.
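One rough way to see the difference is to time both variants on larger inputs. The sketch below uses synthetic data (the `names` and `items` lists and both function names are made up for illustration), and the timings will vary by machine, so none are quoted here.

```python
import timeit

# Synthetic stand-ins for listName and listNameAge; reversed so that
# .index() has to scan deep into the list for the early items.
names = ['name%d' % i for i in range(2000)]
items = ['%s_%d' % (n, i) for i, n in enumerate(reversed(names))]

def with_index():
    # O(m * n): every .index() call scans the list from the start.
    return [names.index(s.split('_')[0]) for s in items]

def with_dict():
    # O(m + n) expected: one pass to build the table, one pass of lookups.
    pos = {n: i for i, n in enumerate(names)}
    return [pos[s.split('_')[0]] for s in items]

assert with_index() == with_dict()  # both give the same positions
print('index:', timeit.timeit(with_index, number=5))
print('dict: ', timeit.timeit(with_dict, number=5))
```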

Here's a fast one-liner using generators:

from itertools import count, repeat  # Python 3
list(map(lambda x, y: y[x[:x.find('_')]], listNameAge, repeat(dict(zip(listName, count())))))

A more readable version would be (as Ami has shown):

nameMap = dict(zip(listName, range(len(listName))))  # name -> position
thirdList = list(map(lambda x: nameMap[x[:x.find('_')]], listNameAge))

