Find common elements of two strings including characters that occur many times

I would like to get the common elements of two given strings, with duplicates taken into account. That is, if a letter occurs 3 times in the first string and 2 times in the second one, it has to occur 2 times in the common string. The two strings may have different lengths. For example:

s1 = 'aebcdee'
s2 = 'aaeedfskm' 
common = 'aeed'

I cannot use the intersection of two sets. What would be the easiest way to get the result 'common'? Thanks.

Well, there are multiple ways to get the desired result. For me, the simplest algorithm would be:

  1. Define an empty dict, like d = {}.
  2. Iterate through each character of the first string:
    if the character is not present in the dictionary, add it to the dictionary with a count of 1.
    else increment the count of that character in the dictionary.
  3. Create a variable common = "".
  4. Iterate through the characters of the second string; if the count of that character in the dictionary above is greater than 0, decrement its value and add this character to common.
  5. Do whatever you want to do with common.

The complete code for this problem:

s1 = 'aebcdee'
s2 = 'aaeedfskm' 

d = {}

for c in s1:
    if c in d:
        d[c] += 1
    else:
        d[c] = 1

common = ""

for c in s2:
    if c in d and d[c] > 0:
        common += c
        d[c] -= 1

print(common)

You can use two arrays (length 26). One array is for the first string and the other is for the second string.

Initialize both arrays to 0.

In the first array, index 0 holds the number of "a"s in the first string, index 1 holds the number of "b"s, and so on up to index 25, which holds the number of "z"s.

Similarly, you can create an array for the second string and store the count of each letter at its corresponding index.

Below is the array example for s1 = 'aebcdee' and s2 = 'aaeedfs'.

[Figure: the arrays after counting the letters]
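Since the figure is not reproduced here, the counting step can be sketched in a few lines of Python. This is only an illustration: the helper name `letter_counts` is made up for this example, and it assumes the strings contain nothing but lowercase letters a to z.

```python
def letter_counts(s):
    """Return a 26-slot list: slot 0 counts 'a', ..., slot 25 counts 'z'."""
    counts = [0] * 26
    for ch in s:
        # Assumption: ch is a lowercase ASCII letter, so the index is 0..25.
        counts[ord(ch) - ord('a')] += 1
    return counts

arr1 = letter_counts('aebcdee')
arr2 = letter_counts('aaeedfs')

# Show only the first six slots (letters a to f) to keep the output short.
print(arr1[:6])  # [1, 1, 1, 1, 3, 0]
print(arr2[:6])  # [2, 0, 0, 1, 2, 1]
```

The remaining slots are 0, except slot 18 of the second array, which holds the single "s".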

Now you can run through the first string, s1 = 'aebcdee'. For each letter, find

K = minimum of ( [ count(letter) in Array 1 ], [ count(letter) in Array 2 ] )

and print that letter K times. Then set that letter's count to 0 in both arrays. (If you didn't make it zero, the algorithm would print the same letter again when it comes up later in the string.)

Complexity - O( length(S1) ) for this pass, on top of the O( length(S1) + length(S2) ) needed to fill the two arrays.

Note - You can also run through the shorter of the two strings to reduce the work in this pass. In that case the pass is O( minimum [ length(S1), length(S2) ] ).
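A minimal sketch of the whole two-array approach, assuming both strings contain only lowercase letters a to z. The function name `common_chars` is invented for this example.

```python
def common_chars(s1, s2):
    """Two-array approach: emit each letter min(count1, count2) times."""
    counts1 = [0] * 26
    counts2 = [0] * 26
    # Assumption: only lowercase ASCII letters, so every index is 0..25.
    for ch in s1:
        counts1[ord(ch) - ord('a')] += 1
    for ch in s2:
        counts2[ord(ch) - ord('a')] += 1

    out = []
    for ch in s1:
        i = ord(ch) - ord('a')
        k = min(counts1[i], counts2[i])
        out.append(ch * k)
        # Zero both counts so a later repeat of ch adds nothing.
        counts1[i] = counts2[i] = 0
    return ''.join(out)

print(common_chars('aebcdee', 'aaeedfskm'))  # aeed
```

The letters come out in the order in which they first appear in s1, with all copies of a letter grouped together.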

Please let me know if you want the implementation of this.

You can use collections.Counter to count each character in the two strings. For each character that exists in both strings, take the min of its two counts, then create a new string by joining the results.

from collections import Counter

s1 = 'aebcdee'
s2 = 'aaeedfskm'

cnt1 = Counter(s1)
cnt2 = Counter(s2)

# Pair up the two counts for every character present in both strings.
# (Zipping the two Counters would stop at the shorter one and could
# drop characters, so look each key up directly instead.)
res = {ch: [cnt1[ch], cnt2[ch]] for ch in cnt1 if ch in cnt2}

# res -> {'a': [1, 2], 'e': [3, 2], 'd': [1, 1]}

out = ''.join(k * min(v) for k, v in res.items())
print(out)
# aeed

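As a shorter variant of the same idea, `Counter` supports the `&` operator, which keeps the minimum count of each key, and `elements()`, which repeats each key by its count. The sketch below wraps this in a function whose name, `common_counter`, is invented for the example.

```python
from collections import Counter

def common_counter(s1, s2):
    """Multiset intersection: each character kept min(count1, count2) times."""
    shared = Counter(s1) & Counter(s2)
    return ''.join(shared.elements())

print(common_counter('aebcdee', 'aaeedfskm'))  # aeed
```

If the order of the characters matters, it is safer to sort the result or to build it explicitly than to rely on the ordering of the Counter.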
s1 = "ckglter"
s2 = "ancjkle"
final_list = []
if len(s1) < len(s2):
    for i in s1:
        if i in s2:
            final_list.append(i)
else:
    for i in s2:
        if i in s1:
            final_list.append(i)
print(final_list)

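You can also do it like this: just iterate through the shorter string with a for loop and append each character that also appears in the other string to the empty list. Note that this only tests membership, so repeated characters are not limited to the smaller of their two counts.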
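One possible way to adapt this loop so that it respects repeated characters is to consume each matched character from a working copy of the other string. This is a sketch of that adaptation, not the code posted in the answer above, and the function name `common_with_duplicates` is made up for the example.

```python
def common_with_duplicates(s1, s2):
    """Loop-based version that uses up one matching character per hit."""
    remaining = list(s2)   # working copy; matched characters are removed
    final_list = []
    for ch in s1:
        if ch in remaining:
            final_list.append(ch)
            remaining.remove(ch)  # removes only the first occurrence
    return ''.join(final_list)

print(common_with_duplicates('aebcdee', 'aaeedfskm'))  # aede
```

The result follows the order of s1, so it prints 'aede' rather than 'aeed', but it contains the same characters the same number of times. Because `in` and `remove` both scan the list, this takes time proportional to the product of the two lengths, which is fine for short strings.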
