
How can I check if a string has the same characters? Python

I need to be able to tell whether a string of arbitrary length greater than 1 (lowercase only) has the same set of characters as a base or template string.

For example, take the string "aabc": "azbc" and "aaabc" would be false while "acba" would be true.

Is there a fast way to do this in Python without keeping track of all the permutations of the first string and then comparing them to the test string?

Sort the two strings and then compare them:

sorted(str1) == sorted(str2)

If the strings might not be the same length, you might want to check that first to save time:

len(str1) == len(str2) and sorted(str1) == sorted(str2)
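
As a quick check against the examples from the question, the comparison can be wrapped in a small helper. This is only a sketch; the name `same_chars_sorted` is made up here and does not come from the answer.

```python
def same_chars_sorted(str1, str2):
    # Cheap length check first; sorting is skipped when the lengths differ.
    return len(str1) == len(str2) and sorted(str1) == sorted(str2)


template = "aabc"
for candidate in ("azbc", "aaabc", "acba"):
    print(candidate, same_chars_sorted(template, candidate))
```

This should report False for "azbc" and "aaabc" and True for "acba", matching what the question asks for.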

This is the O(n) solution:

from collections import Counter
Counter(str1) == Counter(str2)

But the O(n * log n) solution using sorted is likely faster for sensible values of n.
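
Which of the two is faster depends on the input and the interpreter, so it may be worth measuring on your own data. Below is a minimal timing sketch; the helper names `by_sorting`, `by_counting` and `time_both` are hypothetical, and the sample strings are chosen arbitrarily.

```python
import timeit
from collections import Counter


def by_sorting(str1, str2):
    # O(n log n): compare the sorted character lists.
    return len(str1) == len(str2) and sorted(str1) == sorted(str2)


def by_counting(str1, str2):
    # O(n): compare per-character occurrence counts.
    return Counter(str1) == Counter(str2)


def time_both(str1, str2, number=1000):
    """Return (seconds for sorting, seconds for counting) over `number` calls."""
    t_sort = timeit.timeit(lambda: by_sorting(str1, str2), number=number)
    t_count = timeit.timeit(lambda: by_counting(str1, str2), number=number)
    return t_sort, t_count


if __name__ == "__main__":
    s1 = "abcdefghijklmnopqrstuvwxyz" * 10
    s2 = s1[::-1]
    # Both approaches should agree before their speed is compared.
    assert by_sorting(s1, s2) == by_counting(s1, s2)
    print(time_both(s1, s2))
```

Varying the length of `s1` gives a rough idea of where, if anywhere, the counting approach overtakes sorting on a given machine.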

Here's a variation on @Joowani's solution that only uses one dictionary and runs even faster (at least on my machine):

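import collections

def cmp4(str1, str2):
    if len(str1) != len(str2):
        return False
    d = collections.defaultdict(int)
    # Count characters of str1 up and characters of str2 down.
    for c in str1:
        d[c] += 1
    for c in str2:
        d[c] -= 1
    # Every count is back at zero only if the two strings match.
    # (Use d.itervalues() instead on Python 2.)
    return all(v == 0 for v in d.values())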

Here is another O(n) solution, longer but slightly faster than the others:

def cmp(str1, str2):
    if len(str1) != len(str2):
        return False

    d, d2 = {}, {}
    for char in str1:
        if char not in d:
            d[char] = 1
        else:
            d[char] += 1
    for char in str2:
        if char not in d:
            return False
        if char not in d2:
            d2[char] = 1
        else:
            d2[char] += 1

    return d == d2

It basically does the same thing as gnibber's solution (but for some strange reason the Counter() from the collections library seems quite slow). Here are some timeit results:

setup = '''
import collections
from collections import Counter

s1 = "abcdefghijklmnopqrstuvwxyz" * 10000
s2 = s1[::-1]

def cmp1(str1, str2):
    if len(str1) != len(str2):
        return False

    d, d2 = {}, {}
    for char in str1:
        if char not in d:
            d[char] = 1
        else:
            d[char] += 1
    for char in str2:
        if char not in d:
            return False
        if char not in d2:
            d2[char] = 1
        else:
            d2[char] += 1
    return d == d2

def cmp2(str1, str2):
    return len(str1) == len(str2) and sorted(str1) == sorted(str2)

def cmp3(str1, str2):    
    return Counter(str1) == Counter(str2)

def cmp4(str1, str2):
    if len(str1) != len(str2):
        return False
    d = collections.defaultdict(int)
    for c in str1:
        d[c] += 1
    for c in str2:
        d[c] -= 1
    return all(v == 0 for v in d.itervalues())
'''

    import timeit

    timeit.timeit("cmp1(s1, s2)", setup=setup, number=100)  # 8.027034027221656
    timeit.timeit("cmp2(s1, s2)", setup=setup, number=100)  # 8.175071701324946
    timeit.timeit("cmp3(s1, s2)", setup=setup, number=100)  # 14.243422195893174
    timeit.timeit("cmp4(s1, s2)", setup=setup, number=100)  # 5.0937542822775015

Also, David's solution comes out on top when the strings are small and they actually have the same characters.

EDIT: updated the test results

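Here's a different way, using something we tend to overlook, sets. Note that this only checks that every character of str1 also appears in str2; it does not compare how often each character occurs: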

if len(set(str1) - set(str2)) == 0:
    print("Yes")
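
One thing worth checking with the set approach is how it treats repeated characters, since a set keeps each character only once. The sketch below compares it with the sorted comparison on the "aaabc" example from the question; the function names `chars_subset` and `same_char_set` are made up for illustration.

```python
def chars_subset(str1, str2):
    # True when every distinct character of str1 also occurs in str2.
    return len(set(str1) - set(str2)) == 0


def same_char_set(str1, str2):
    # True when both strings use exactly the same distinct characters.
    return set(str1) == set(str2)


# Sets drop duplicates, so the extra "a" in "aaabc" goes unnoticed.
print(chars_subset("aabc", "aaabc"))
print(same_char_set("aabc", "aaabc"))
# The sorted comparison keeps duplicates and tells the two apart.
print(sorted("aabc") == sorted("aaabc"))
```

Both set-based checks accept "aaabc" against "aabc", while the sorted comparison rejects it, so the set approach only fits if character counts do not matter.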

If you have a very long string, the following solution will be helpful, with O(n) time complexity. You can also use a hash map/dictionary instead of the arrays/lists.

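def same_chars(s1, s2):
    # Strings of different lengths cannot have the same character counts.
    if len(s1) != len(s2):
        return False

    # One counter slot per lowercase letter, "a" through "z".
    ds1 = [0] * 26
    ds2 = [0] * 26

    for i in range(len(s1)):
        ds1[ord(s1[i]) - ord("a")] += 1
        ds2[ord(s2[i]) - ord("a")] += 1

    return ds1 == ds2

s1 = "sjkhdfkaljdhfaldflflad"
s2 = "lsdhfuisfslffsdjdkllja"
print(same_chars(s1, s2))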
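
The remark about using a hash map/dictionary instead of lists could look roughly like the sketch below. It is one possible reading of that suggestion, not the answer author's code, and the name `same_chars_dict` is hypothetical. A dictionary removes the assumption that the input contains only the letters "a" to "z".

```python
def same_chars_dict(s1, s2):
    # Strings of different lengths cannot have the same character counts.
    if len(s1) != len(s2):
        return False

    # Count how often each character occurs in s1.
    counts = {}
    for ch in s1:
        counts[ch] = counts.get(ch, 0) + 1

    # Consume one occurrence per character of s2; stop on the first shortage.
    for ch in s2:
        remaining = counts.get(ch, 0)
        if remaining == 0:
            return False
        counts[ch] = remaining - 1

    # Equal lengths and no shortage mean every count reached zero.
    return True


print(same_chars_dict("aabc", "acba"))
print(same_chars_dict("aabc", "azbc"))
```

Because the lengths are checked up front, no final pass over the dictionary is needed: if s2 never runs short of a character, it has used up exactly what s1 provided.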

Notice: technical posts on this site follow the CC BY-SA 4.0 license. If you republish this content, please credit this site's URL or the original source.
