
Compare list of tuples, confirm subset based on condition?

Given a word and a list of words, I have to find the list elements that can be built using the letters of the given word (the count of each letter matters). I have tried to use the Counter object from collections together with a reimplementation of Python 2.7's cmp() function (I'm using 3.6.5).

I have since come to realize that this approach seems to be bad practice for such a problem (earlier, I was trying to compare the Counter dictionaries directly). My program doesn't work because compare_fn relies on the '>' and '<' operations between lists, which give a result based on lexicographical order (referred from here). So even though 'raven' can be made from 'ravenous', the program below fails because of the order of the characters in the sorted lists.

from collections import Counter    
word = 'ravenous'
candidates = ["raven", "mealwheel", "rasputin"]

def count_fn(mystr):
    return sorted(list(Counter(mystr).items()))

def compare_fn (c1,c2):
    return ((c1>c2) - (c1<c2))

list_word =  count_fn(word)
list_candidates = list(map(count_fn,candidates))
cmp_list = [compare_fn(list_word,i) for i in list_candidates]
print(cmp_list)
# [-1, -1, -1], but should be [1, -1, -1]
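
To see where the -1 for 'raven' comes from, it may help to look at how Python orders lists: it compares them element by element and decides at the first position where they differ. The following is only a small sketch using the two words from this question; the names list_raven and first_diff are illustrative and not part of the original program.

```python
from collections import Counter

list_word = sorted(Counter("ravenous").items())
list_raven = sorted(Counter("raven").items())

# Lists are compared element by element; the first differing pair decides.
first_diff = next((a, b) for a, b in zip(list_word, list_raven) if a != b)
print(first_diff)              # (('o', 1), ('r', 1))

# 'o' sorts before 'r', so list_word counts as "smaller" than list_raven,
# even though every letter of 'raven' is available in 'ravenous'.
print(list_word < list_raven)  # True
```

The comparison therefore answers an ordering question, not a containment question, which is why it cannot be used to test for a subset.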

So, for the two lists below, how can I confirm that list_candidates[0] is a subset of list_word? Please note that the comparison of ('a',1) in list_word against ('a',1) in list_candidates[i] could also be ('a',5) in list_word against ('a',1) in list_candidates[i]; both cases should count as true.

print(list_word)
#[('a', 1), ('e', 1), ('n', 1), ('o', 1), ('r', 1), ('s', 1), ('u', 1), ('v', 1)]
print(list_candidates[0])
#[('a', 1), ('e', 1), ('n', 1), ('r', 1), ('v', 1)]

I think using counters is a good choice. Don't turn them into lists. I purposely returned [True, False, False] instead of [1, -1, -1], but you can change that easily.

Moreover: I used a list comprehension instead of map because it is more idiomatic in Python, but the semantics are the same.

from collections import Counter
word = 'ravenous'
candidates = ["raven", "mealwheel", "rasputin"]

def count_fn(mystr):
    return Counter(mystr)

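def compare_fn(c1, c2):
    # True if c1 has at least as many of each character as c2 needs.
    # A Counter returns 0 for a missing key, so absent letters fail the test.
    return all(c1[char] >= count for char, count in c2.items())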

counter_word =  count_fn(word)
list_candidates = [count_fn(candidate) for candidate in candidates]
cmp_list = [compare_fn(counter_word, i) for i in list_candidates]
print(cmp_list)  # [True, False, False]
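
As an alternative sketch, not taken from the answer above: subtracting one Counter from another keeps only the positive counts, so an empty result means every letter of the candidate is covered by the word. The helper name can_build is hypothetical. On Python 3.10 and later, Counter(candidate) <= Counter(word) should express the same inclusion test, but that operator is not available on 3.6.

```python
from collections import Counter

def can_build(candidate, word):
    # Hypothetical helper: 'missing' holds the letters the candidate needs
    # beyond what the word provides; an empty Counter is falsy.
    missing = Counter(candidate) - Counter(word)
    return not missing

word = 'ravenous'
candidates = ["raven", "mealwheel", "rasputin"]
results = [can_build(c, word) for c in candidates]
print(results)  # [True, False, False]
```

If the 1 / -1 convention from the question is preferred, the booleans can be mapped with something like [1 if r else -1 for r in results].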
