Sort and filter a list based on elements from a second list
I have two lists. The first one contains people's names, each followed by various characters (numbers, letters), e.g.:
listNameAge = ['alain_90xx', 'fred_10y', 'george_50', 'julia_10l','alain_10_aa', 'fred_90', 'julia_50', 'george_10s', 'alain_50', 'fred_50', 'julia_90']
The second one contains only the names of the people:
listName = ['fred', 'julia', 'alain', 'george']
Using the second list, I would like to build a third list from the first one, such that each name in the first list is mapped to its index position in the second one, i.e.:
thirdlist = [2, 0, 3, 1, 2, 0, 1, 3, 2, 0, 1]
The name and the characters are separated by an underscore, but the characters can be of any sort. I could loop over the elements of listNameAge, separate each person's name from the rest of the characters using .split('_') on the string, then find which name it is and look up its index in listName using a second loop.
I was, however, wondering whether there is a simpler way to do this, i.e. avoid the explicit loops and use only a list comprehension?
While you can do this with a one-liner, I think that, for efficiency, it would pay to build a dictionary:
>>> namePos = dict((name, i) for (i, name) in enumerate(listName))
>>> [namePos[n.split('_')[0]] for n in listNameAge]
[2, 0, 3, 1, 2, 0, 1, 3, 2, 0, 1]
The (expected) running time of this code is Θ(m + n), where m is the length of the first list and n is the length of the other one.
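As a minimal sketch of the dictionary approach above, written for Python 3: the helper name `positions` is hypothetical (it is not part of the original answer), and `str.partition` is swapped in for `split` so that only the text before the first underscore is used. A name that is missing from the second list would raise a `KeyError` here.

```python
def positions(tagged, names):
    """Map each 'name_suffix' string to the index of its name in `names`."""
    # Build the name -> index table once: O(n).
    name_pos = {name: i for i, name in enumerate(names)}
    # One expected O(1) dictionary lookup per item: O(m) in total.
    return [name_pos[item.partition('_')[0]] for item in tagged]


listNameAge = ['alain_90xx', 'fred_10y', 'george_50', 'julia_10l', 'alain_10_aa',
               'fred_90', 'julia_50', 'george_10s', 'alain_50', 'fred_50', 'julia_90']
listName = ['fred', 'julia', 'alain', 'george']

print(positions(listNameAge, listName))
# [2, 0, 3, 1, 2, 0, 1, 3, 2, 0, 1]
```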
For this specific question, I would recommend using a loop, just for clarity. However, if you must use a list comprehension, you can do it in essentially the same way:
thirdlist = [listName.index(x[:x.find('_')]) for x in listNameAge]
thirdList = [listName.index(string.split("_")[0]) for string in listNameAge]
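This is a list comprehension built from listName.index(string.split("_")[0]), where string is bound to each item of listNameAge in turn. string.split("_")[0] is the part of the string from its start up to the first underscore, so listName.index(string.split("_")[0]) is the position at which that name first occurs in listName.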
You can take each item in listNameAge, split it on '_', take the first part of the split, then use index to find it in the second list.
>>> [listName.index(i.split('_')[0]) for i in listNameAge]
[2, 0, 3, 1, 2, 0, 1, 3, 2, 0, 1]
You can try this: for each item in listNameAge, check whether a name from listName appears in it:
thirdList = []
for x in listNameAge:
    for y in listName:
        # substring test: is the name y contained in the item x?
        if y in x:
            thirdList.append(listName.index(y))
Result:
[2, 0, 3, 1, 2, 0, 1, 3, 2, 0, 1]
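One caveat with the substring test `y in x`: it also matches a name that merely occurs inside another name or inside the suffix. The sketch below contrasts it with a stricter prefix test. The sample names 'al' and 'alain' are made up for illustration and do not come from the question.

```python
names = ['al', 'alain']           # hypothetical names; one is a prefix of the other
tagged = ['alain_90', 'al_10']

loose = []
for x in tagged:
    for y in names:
        if y in x:                # substring test, as in the loop above
            loose.append(names.index(y))

strict = []
for x in tagged:
    for y in names:
        if x.startswith(y + '_'):  # the name must be followed by the separator
            strict.append(names.index(y))

print(loose)   # [0, 1, 0]  ('alain_90' matched both names)
print(strict)  # [1, 0]
```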
I would strongly recommend against using .index(), as its complexity is O(n), which makes the overall complexity of this operation O(mn), where m and n are the sizes of the lists.
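To see the difference in practice, here is a rough timing sketch using `timeit`. The list sizes and the generated names are made up, and the absolute numbers will vary from machine to machine, so no timings are quoted.

```python
import timeit

names = ['name%d' % i for i in range(1000)]   # hypothetical data
tagged = [n + '_x' for n in names] * 5        # 5000 tagged items

def with_index():
    # list.index scans the list for every item: O(m * n) overall
    return [names.index(s.split('_')[0]) for s in tagged]

def with_dict():
    # build the lookup table once, then do O(1) expected lookups: O(m + n)
    pos = {n: i for i, n in enumerate(names)}
    return [pos[s.split('_')[0]] for s in tagged]

assert with_index() == with_dict()            # both give the same positions
print('index:', timeit.timeit(with_index, number=3))
print('dict: ', timeit.timeit(with_dict, number=3))
```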
Here's a fast one-liner using generators:
from itertools import count, izip, repeat  # Python 2 only
map(lambda (x,y): y[x[:x.find('_')]],izip(listNameAge, repeat(dict(izip(listName, count())))))
A more readable version would be (as Ami has shown):
nameMap = dict(izip(listName, xrange(len(listName))))
thirdList = map(lambda x: nameMap[x[:x.find('_')]],listNameAge)
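The snippets above rely on Python 2 features (`izip`, `xrange`, and a lambda with a tuple parameter). A rough Python 3 equivalent of the readable version might look like the following sketch, which reuses the lists from the question:

```python
from itertools import count

listNameAge = ['alain_90xx', 'fred_10y', 'george_50', 'julia_10l', 'alain_10_aa',
               'fred_90', 'julia_50', 'george_10s', 'alain_50', 'fred_50', 'julia_90']
listName = ['fred', 'julia', 'alain', 'george']

# zip is already lazy in Python 3, so it stands in for izip;
# count() supplies 0, 1, 2, ... and zip stops when listName is exhausted.
nameMap = dict(zip(listName, count()))
# map returns an iterator in Python 3, so wrap it in list() to get a list.
thirdList = list(map(lambda x: nameMap[x[:x.find('_')]], listNameAge))

print(thirdList)
# [2, 0, 3, 1, 2, 0, 1, 3, 2, 0, 1]
```

Note that `x.find('_')` returns -1 when there is no underscore, so this slicing assumes every item contains one.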