
How to add elements to a list that is a dictionary value, without those elements being repeated as other keys of that dictionary?

Suppose I have a list that contains anagram strings. For example:

anList = ['aba','baa','aab','cat','tac','act','sos','oss']

I want to construct a dictionary in which an element of that list is the key and the other anagrams of that element are the value, stored as a list. In addition, elements that have been added to a value list must not appear again as another key of the dictionary. For example, once 'baa' has been added to the list that is the value of key 'aba', 'baa' cannot be added as a key later. The output dictionary should look like this:

anDict = {'aba' : ['baa','aab'],'cat' : ['tac','act'],'sos' : ['oss']}

I have tried many approaches, but the problem is that elements already added to a list are added again as keys of the dictionary.

How can I do it?

You can group the words by their letter counts using a Counter object:

from collections import Counter
from itertools import groupby

def letter_counts(word):
    # Counter objects have no total order in Python 3, so sort on their sorted items instead.
    return sorted(Counter(word).items())

sortedList = sorted(anList, key=letter_counts)
groups = [list(y) for x, y in groupby(sortedList, key=letter_counts)]
# [['cat', 'tac', 'act'], ['aba', 'baa', 'aab'], ['sos', 'oss']]

Now, convert the list of lists of anagrams into a dictionary:

{words[0]: words[1:] for words in groups}
# {'cat': ['tac', 'act'], 'aba': ['baa', 'aab'], 'sos': ['oss']}
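
One detail worth spelling out: itertools.groupby only merges adjacent items with equal keys, which is why the list is sorted before grouping. Below is a minimal sketch of that behaviour. The reordered list `shuffled` and the choice of `key=sorted` are assumptions made for illustration, not part of the answer above.

```python
from itertools import groupby

# Hypothetical reordering of the question's list, so that anagrams are not adjacent.
shuffled = ['cat', 'aba', 'tac', 'baa', 'act', 'aab']

# Without sorting first, each run of adjacent anagrams becomes its own group.
unsorted_groups = [list(g) for _, g in groupby(shuffled, key=sorted)]

# Sorting by the same key first places all anagrams next to each other.
sorted_groups = [list(g) for _, g in groupby(sorted(shuffled, key=sorted), key=sorted)]

print(unsorted_groups)
print(sorted_groups)
```

In this input no two neighbours are anagrams, so the unsorted call produces six one-word groups, while the presorted call produces two groups of three.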

This version keeps the order of occurrence and also handles anagrams that are not grouped together in the list:

anagram_list = ['cat','aba','baa','aab','tac','sos','oss','act']

first_anagrams = {}
anagram_dict = {}

for word in anagram_list:
    sorted_word = ''.join(sorted(word))
    if sorted_word in first_anagrams:
        anagram_dict[first_anagrams[sorted_word]].append(word)
    else:
        first_anagrams[sorted_word] = word
        anagram_dict[word] = []

print(anagram_dict)

The output (on Python 3.7+, where dicts keep insertion order) is

{'cat': ['tac', 'act'], 'aba': ['baa', 'aab'], 'sos': ['oss']}

where the key is always the first anagram in order of occurrence. The algorithm makes a single pass over the list, so it is O(n) for n words of negligible length; sorting the letters of a word of length k adds an O(k log k) cost per word.
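
The loop can be wrapped in a function to check the first-occurrence behaviour on other inputs. This is only a sketch: the function name `group_anagrams` and the variable names are made up for illustration, and it assumes Python 3.7+ so that the dict keeps insertion order.

```python
def group_anagrams(words):
    """Map the first-seen word of each anagram group to its later anagrams."""
    first_seen = {}  # sorted letters -> first word seen with those letters
    groups = {}      # first word -> later anagrams, in order of occurrence
    for word in words:
        signature = ''.join(sorted(word))  # O(k log k) for a word of length k
        if signature in first_seen:
            groups[first_seen[signature]].append(word)
        else:
            first_seen[signature] = word
            groups[word] = []
    return groups


result = group_anagrams(['cat', 'aba', 'baa', 'aab', 'tac', 'sos', 'oss', 'act'])
print(result)
```

Note that an exact duplicate of a key (for example a second 'cat') would be appended to that key's list, since it has the same sorted letters.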


Should you want all anagrams in the list, including the first one, it becomes much easier:

from collections import defaultdict

anagram_list = ['cat','aba','baa','aab','tac','sos','oss','act']

first_anagrams = {}
anagram_dict = defaultdict(list)

for word in anagram_list:
    anagram_dict[first_anagrams.setdefault(''.join(sorted(word)), word)].append(word)

The result (as shown by Python 3) is

defaultdict(<class 'list'>,
    {'cat': ['cat', 'tac', 'act'], 'aba': ['aba', 'baa', 'aab'], 'sos': ['sos', 'oss']})

The answers from @DYZ and @AnttiHaapala handle the expected output posted in the question much better than this one.

The following approach uses collections.defaultdict and comes with some caveats. Sort the letters of each list element to form the anagram key, and append any anagram that is not identical to that key.

from collections import defaultdict

anagrams = ['aba','baa','aab','cat','tac','act','sos','oss']

d = defaultdict(list)
for a in anagrams:
    key = ''.join(sorted(a))
    if key != a:
        d[key].append(a)

print(d)
# {'aab': ['aba', 'baa'], 'act': ['cat', 'tac'], 'oss': ['sos']}

Caveats:

  • always uses the ascending sorted version of the anagram as the dict key, which is not an exact match for the example output in the question
  • if the ascending sorted version of the anagram is not in the list, this approach will add a previously non-existent anagram as the dict key
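
The second caveat is easy to reproduce. In the sketch below, the input list `words` is a hypothetical example chosen so that the sorted form 'act' never occurs in it:

```python
from collections import defaultdict

# Hypothetical input: the sorted form 'act' does not occur in the list itself.
words = ['cat', 'tac']

d = defaultdict(list)
for w in words:
    key = ''.join(sorted(w))
    if key != w:
        d[key].append(w)

print(dict(d))         # the only key is 'act'
print('act' in words)  # the key is not one of the input words
```

Both words end up under the key 'act', even though 'act' was never in the input.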

You can use the function groupby() on a presorted list. The function sorted can be used as the key for both sorting and grouping (Counter also works as a grouping key, but in Python 3 Counter objects have no total order, so it is not a reliable sort key):

from itertools import groupby

anList = ['aba', 'baa', 'aab', 'cat', 'tac', 'act', 'sos', 'oss']

{k: v for _, (k, *v) in groupby(sorted(anList, key=sorted), key=sorted)}
# {'aba': ['baa', 'aab'], 'cat': ['tac', 'act'], 'sos': ['oss']}
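
Because the list is sorted before grouping, the groups come out ordered by their sorted letters rather than by where they first appear in the input. Within a group the original order survives, because sorted() is stable. A small sketch with a reordered input (the list `words` is an assumption for illustration):

```python
from itertools import groupby

# Hypothetical input in which 'sos' appears before any anagram of 'cat'.
words = ['sos', 'tac', 'oss', 'cat']

result = {k: v for _, (k, *v) in groupby(sorted(words, key=sorted), key=sorted)}
print(result)
```

Here 'tac' becomes the first key even though 'sos' appears first in the input, since ['a', 'c', 't'] sorts before ['o', 's', 's'].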

Here is slow but working code:

anList = ['aba', 'baa', 'aab', 'cat', 'tac', 'act', 'sos', 'oss']
anDict = {}
for i in anList:
    in_dict = False
    for j in anDict.keys():
        if sorted(i) == sorted(j):
            in_dict = True
            anDict[j].append(i)
            break
    if not in_dict:
        anDict[i] = []
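
To see why this is slow: each new word is compared against every existing key until a match is found, so a list of n words with no anagrams among them needs n(n-1)/2 comparisons. A rough sketch that counts them; the function name `count_comparisons` and the counter are added purely for illustration:

```python
def count_comparisons(words):
    """Run the nested-loop grouping and count how many key comparisons it makes."""
    groups = {}
    comparisons = 0
    for word in words:
        in_dict = False
        for key in groups:
            comparisons += 1
            if sorted(word) == sorted(key):
                in_dict = True
                groups[key].append(word)
                break
        if not in_dict:
            groups[word] = []
    return groups, comparisons


# Four words, none of which are anagrams of each other: 0 + 1 + 2 + 3 comparisons.
groups, comparisons = count_comparisons(['a', 'b', 'c', 'd'])
print(comparisons)
```

Each comparison also sorts both words again, which adds to the cost.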

You may use else with a for loop to achieve this:

anList = ['aba','baa','aab','cat','tac','act','sos','oss']
anDict = dict()

for k in anList:
    for ok in anDict:
        if ok == k:
            break
        if sorted(ok) == sorted(k):
            anDict[ok].append(k)
            break
    else:
        anDict[k] = []

print(anDict)
# {'aba': ['baa', 'aab'], 'cat': ['tac', 'act'], 'sos': ['oss']}
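
For readers unfamiliar with the construct: the else branch of a for loop runs only when the loop finishes without hitting break. A minimal sketch in isolation, where the helper name `find_anagram_key` is hypothetical:

```python
def find_anagram_key(word, keys):
    """Return the first key that is an anagram of word, or None if there is none."""
    for key in keys:
        if sorted(key) == sorted(word):
            match = key
            break
    else:
        # Reached only when the loop ended without a break, i.e. no key matched.
        match = None
    return match


print(find_anagram_key('tac', ['aba', 'cat']))
print(find_anagram_key('sos', ['aba', 'cat']))
```

The first call finds 'cat' and leaves the loop via break; the second exhausts the keys and falls through to the else branch.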

A simple version without itertools.

Create a multimap sorted string -> [anagram string]:

>>> L = ['aba', 'baa', 'aab', 'cat', 'tac', 'act', 'sos', 'oss']
>>> d = {}
>>> for v in L:
...     d.setdefault("".join(sorted(v)), []).append(v)
...
>>> d
{'aab': ['aba', 'baa', 'aab'], 'act': ['cat', 'tac', 'act'], 'oss': ['sos', 'oss']}

Now that you've grouped the anagrams, use the first value of each group as the key of the returned dict:

>>> {v[0]:v[1:] for v in d.values()}
{'aba': ['baa', 'aab'], 'cat': ['tac', 'act'], 'sos': ['oss']}
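
The grouping step relies on dict.setdefault, which returns the stored value for a key and stores the given default first if the key is missing. A short sketch of that behaviour on its own (the variable names are illustrative):

```python
d = {}

# First call: 'act' is missing, so setdefault stores the empty list and returns it.
first = d.setdefault('act', [])
first.append('cat')

# Second call: 'act' exists, so the stored list is returned and the new default is ignored.
second = d.setdefault('act', [])
second.append('tac')

print(d)
print(first is second)
```

Both calls return the same list object, which is why appending to the return value builds up each group in place.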

anList = ['aba', 'baa', 'aab', 'cat', 'tac', 'act', 'sos', 'oss']

anDict = {}
for word in anList:
    sorted_word = ''.join(sorted(word))
    found_key = [key for key in anDict.keys() if sorted_word == ''.join(sorted(key))]
    if found_key:
        anDict[found_key[0]].append(word)
    else:
        anDict[word] = []


>>> anDict
{'aba': ['baa', 'aab'], 'cat': ['tac', 'act'], 'sos': ['oss']}
