
The most efficient way to test 2 strings for anagrams in Python

There are many ways of testing whether strings are anagrams. However, I wonder if there is a way to iterate over each word only once? And if not, what's the most efficient way to do it in Python?

We can traverse the second string, checking whether each character is present in the first string. However, in the worst case (one string is the reverse of the other, a "reversegram"), that gives us n-1 iterations over the first string when using the built-in __contains__() method (the __iter__() method is called).

def is_anagram(str_1, str_2):
  #check if same length
  if (len(str_1) != len(str_2)):
    return False
  else:
    #lowercase all characters  
    str1, str2 = list(str_1.lower()),list(str_2.lower())
    for letter in str1:
      if letter not in str2:
        return False
      str2.remove(letter)
    return True

Is there any other way?

If you can use collections.Counter, then it becomes simple, because if two words are anagrams their counters have the same keys and the same values.

from collections import Counter
def is_anagram(word1,word2):
    return Counter(word1)==Counter(word2)

word1 = 'ahbgrettf'
word2 = 'arethbfgt'

print(is_anagram(word1,word2))
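To make the "same keys and same values" point concrete, here is a small sketch of what Counter holds for a word. The helper name `letter_counts` is made up for this illustration and is not part of the answer above.

```python
from collections import Counter

def letter_counts(word):
    # Hypothetical helper: Counter maps each character to its number of occurrences.
    return Counter(word)

# Counter is a dict subclass, so it compares equal to a plain dict with the same items.
print(letter_counts('banana') == {'b': 1, 'a': 3, 'n': 2})
# Two anagrams produce equal counters regardless of character order.
print(letter_counts('listen') == letter_counts('silent'))
```

Building each Counter is a single pass over its word, so the whole comparison stays linear in the length of the input.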

To add to @Maxime's answer: if we use defaultdict, we don't have to check whether a key exists before incrementing it. We then check that the keys match and that the values match to decide whether it's an anagram.

from collections import defaultdict

def is_anagram(word1,word2):
    table1, table2 = defaultdict(int), defaultdict(int)

    for c in word1:
        table1[c]+=1

    for c in word2:
        table2[c]+=1

    if set(table1.keys()) == set(table2.keys()):
        for k, v in table1.items():
            if table2[k]!=v:
                return False
    else:
        return False
    return True

print(is_anagram('ahbgrettf','arethbfgt'))

Maybe with dictionaries?

Edit: added dan's suggestion.

word1 = 'ahbgrettf'
word2 = 'arethbfgt'


def is_anagram(word1, word2):

    if (len(word1) != len(word2)):
        return False

    word_dic = {}

    # n iterations
    for char in word1:
        if word_dic.get(char):
            word_dic[char] += 1
        else:
            word_dic[char] = 1

    # n iterations
    for char in word2:
        if word_dic.get(char):
            word_dic[char] -= 1
        else:
            return False

    # n iterations
    for v in word_dic.values():
        if v != 0:
            return False

    return True


print(is_anagram(word1, word2))

Total: 3n?
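As a possible refinement of the count above: once the lengths are known to be equal and the second loop refuses to decrement a count that is already zero, every count must end at zero, so the final pass over the values looks redundant. A minimal sketch under that reasoning (the name `is_anagram_two_pass` is made up here):

```python
def is_anagram_two_pass(word1, word2):
    # Anagrams must have the same length.
    if len(word1) != len(word2):
        return False

    counts = {}

    # First pass: count every character of word1.
    for char in word1:
        counts[char] = counts.get(char, 0) + 1

    # Second pass: consume one count per character of word2.
    # A missing or exhausted count means word2 has a surplus character.
    for char in word2:
        if counts.get(char, 0) == 0:
            return False
        counts[char] -= 1

    # Equal lengths and no surplus imply that every count is now zero.
    return True


print(is_anagram_two_pass('ahbgrettf', 'arethbfgt'))
```

That would bring the total to about 2n character visits, which is still O(n) like the 3n version.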

You can use defaultdict to get a default value: create a dictionary of letter frequencies from one string, then subtract from it with the other string. This takes O(3n), which is O(n).

from collections import defaultdict

def is_anagram2(str_1, str_2):
    #check if same length
    if (len(str_1) != len(str_2)):
        return False
    #creates a dictionary with default value of 0 for all keys
    str_1_dict = defaultdict(int)

    #adds how many of each letter in the dictionary
    for i in str_1:
        str_1_dict[i] += 1
    #subtracts how many of each letter in the dictionary
    for i in str_2:
        str_1_dict[i] -= 1
    #checks to make sure all values are 0 (same number of each letter in both strings)
    for i in str_1_dict:
        if not str_1_dict[i] == 0:
            return False
    return True
print(is_anagram2('aaaa','aaaa'))
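Regarding the original question of iterating over each word only once: since both strings have the same length after the check, the add and subtract loops can be fused with zip. This is only a sketch of that idea, and `is_anagram_single_loop` is a name invented for it. It still needs a final pass over the counts, but that pass is over distinct characters rather than over the strings.

```python
from collections import defaultdict

def is_anagram_single_loop(str_1, str_2):
    # Different lengths can never be anagrams.
    if len(str_1) != len(str_2):
        return False

    # Net count per character: +1 for str_1, -1 for str_2.
    balance = defaultdict(int)

    # zip walks both strings in lockstep, so each one is traversed once.
    for a, b in zip(str_1, str_2):
        balance[a] += 1
        balance[b] -= 1

    # Anagrams leave every net count at zero.
    return not any(balance.values())


print(is_anagram_single_loop('ahbgrettf', 'arethbfgt'))
```

Unlike the two-loop version, this one cannot exit early on a surplus character, because a negative balance may still be repaid later in the loop.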

I think using a dictionary is indeed the fastest, since sorting takes O(n log n). Creating the dictionaries, on the other hand, should take O(n + n), which is effectively O(n). The .get() call ensures that if the key is not already there, the default 0 is returned; adding 1 then inserts the key with its value initialized to 1. At the end, comparing the two dictionaries for equality makes sure the same key:value pairs exist in both. Optionally, you can check the lengths of the two strings first and return False right away if they do not match.

def anagram_checker(str1, str2):
    str1 = str1.replace(" ", "").lower() #optional
    str2 = str2.replace(" ", "").lower() #optional

    str1_char_dict = {}
    str2_char_dict = {}

    for char in str1:
        str1_char_dict[char] = str1_char_dict.get(char, 0) + 1

    for char in str2:
        str2_char_dict[char] = str2_char_dict.get(char, 0) + 1

    return str1_char_dict == str2_char_dict
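For comparison, the sorting approach this answer refers to can be sketched as follows. The function name `is_anagram_sorted` is made up for this illustration.

```python
def is_anagram_sorted(str1, str2):
    # sorted() returns a list of characters in order; anagrams sort to the same list.
    # Each sort costs O(n log n), versus O(n) for the dictionary approach.
    return sorted(str1) == sorted(str2)


print(is_anagram_sorted('ahbgrettf', 'arethbfgt'))
```

It is shorter to write, so for short strings it may be a reasonable choice despite the worse asymptotic cost.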
