
Find the difference between two strings of uneven length in Python

a = 'abcdfjghij'
b = 'abcdfjghi'

Output: j

def diff(a, b):
    string=''
    for val in a:
        if val not in b:
            string=val
    return string

a = 'abcdfjghij'
b = 'abcdfjghi'
print(diff(a,b))

This code returns an empty string. Any solution for this?

collections.Counter from the standard library can be used to model multisets, so it keeps track of repeated elements. It is a subclass of dict, so it performs well, and it extends dict with functionality for counting. To find the differences between two strings, you can mimic a symmetric difference between sets:

from collections import Counter

a = 'abcdfjghij'
b = 'abcdfjghi'

ca = Counter(a)
cb = Counter(b)

diff = (cb-ca)+(ca-cb) # symmetric difference

print(diff)
#Counter({'j': 1})
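
Because Counter subtraction keeps only positive counts, each half of that expression can also be inspected on its own to see which string the surplus characters come from. A minimal sketch (the names `only_in_a`, `only_in_b`, `extra_a` and `extra_b` are illustrative and not part of the answer above):

```python
from collections import Counter

a = 'abcdfjghij'
b = 'abcdfjghi'

# Subtraction drops zero and negative counts, so each side holds only its surplus.
only_in_a = Counter(a) - Counter(b)
only_in_b = Counter(b) - Counter(a)

# elements() repeats each character by its count; join turns that into a string.
extra_a = ''.join(only_in_a.elements())
extra_b = ''.join(only_in_b.elements())

print(extra_a)  # j
print(extra_b)  # prints an empty line
```

This gives the plain string `j` that the question asks for, rather than a Counter object.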

If I understand correctly, your question is:

"Given 2 strings of different length, how can I find the characters that are different between them?"

Judging by your example, this implies you want either the characters that are present in only one of the strings, or the characters that are repeated and whose count differs between the two strings.

Here's a simple solution (maybe not the most efficient one), but one that's short and does not require any extra packages:

**UPDATED:**

a = 'abcdfjghij'
b = 'abcdfjghi' 

dict_a = dict( (char, a.count(char)) for char in a)
dict_b = dict( (char, b.count(char)) for char in b)

idx_longest = [dict_a, dict_b].index(max([dict_a, dict_b], key = len))

results = [ k for (k,v) in [dict_a, dict_b][idx_longest].items() if k not in [dict_a, dict_b][1-idx_longest].keys() or v!=[dict_a, dict_b][1-idx_longest][k] ]

print(results)
 > ['j']

Or you can try it with another pair of strings, such as

a = 'abcaa'
b = 'aaa'

print(results)
 > ['b', 'c']

as 'a' appears in both strings an equal number of times.
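
Note that the comprehension above walks only the dictionary with more keys, so a character that occurs solely in the other string is not reported (for example, 'ab' versus 'ac' yields only ['b']). A hedged sketch that checks both sides by iterating over the union of characters; the function name `count_diff` is hypothetical:

```python
def count_diff(a, b):
    """Return the characters whose number of occurrences differs between a and b."""
    # set(a) | set(b) covers characters that appear in either string.
    return sorted(ch for ch in set(a) | set(b) if a.count(ch) != b.count(ch))

print(count_diff('abcdfjghij', 'abcdfjghi'))  # ['j']
print(count_diff('abcaa', 'aaa'))             # ['b', 'c']
print(count_diff('ab', 'ac'))                 # ['b', 'c']
```

The result is sorted only to make the output order predictable, since sets are unordered.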

Updated

But you have j twice in a. So the first time it sees j, it looks at b and sees a j, all good. For the second j it looks again and still sees a j, all good. If you want to check whether each letter matches the letter at the same position in the other string, you should try this:

a = 'abcdfjghij'
b = 'abcdfjghi'

def diff(a, b):
  if len(a)>len(b):
    smallest_len = len(b)
    for index, value in enumerate(a[:smallest_len]):
      if a[index] != b[index]:
        print(f'a value {a[index]} at index {index} does not match b value {b[index]}')
    if len(a) == len(b):
      pass
    else:
      print(f'Extra Values in A Are {a[smallest_len:]}')
  else:
    smallest_len = len(a)
    for index, value in enumerate(b[:smallest_len]):
      if a[index] != b[index]:
        print(f'a value {a[index]} at index {index} does not match b value {b[index]}')
    if len(a) == len(b):
      pass
    else:
      print(f'Extra Values in B Are {b[smallest_len:]}')
  

diff(a, b)
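
The same position-by-position comparison can be written more compactly if the mismatches are returned instead of printed. A sketch using itertools.zip_longest from the standard library; the function name `positional_diff` and the tuple layout are assumptions made for illustration:

```python
from itertools import zip_longest

def positional_diff(a, b):
    """Return (index, char_in_a, char_in_b) for every position where a and b differ."""
    # zip_longest pads the shorter string with None, so trailing extras show up too.
    return [(i, x, y) for i, (x, y) in enumerate(zip_longest(a, b)) if x != y]

print(positional_diff('abcdfjghij', 'abcdfjghi'))  # [(9, 'j', None)]
```

A None in the tuple means that string had already ended at that index.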


In your example, there are 2 differences between the 2 strings: the letters g and j. I tested your code and it returns g, because all the other letters from a are in b:

a = 'abcdfjghij'
b = 'abcdfjhi'

def diff(a, b):
    string=''
    for val in a:
        if val not in b:
            string=val
    return string

print(diff(a,b))
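
The check `val not in b` only asks whether the character occurs anywhere in b, regardless of position or count, which is why the second j in a goes unnoticed. A small sketch of what the loop sees for these two strings (the variable name `missing` is illustrative):

```python
a = 'abcdfjghij'
b = 'abcdfjhi'

# Characters of a that do not occur anywhere in b, in order of appearance.
missing = [val for val in a if val not in b]
print(missing)  # ['g']
```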

It's hard to know exactly what you want based on your question. For example, should

'abc'
'efg'

return 'abc' or 'efg', or is there always just going to be one character added?

Here is a solution that accounts for multiple characters being different, but it still might not give your exact output.

def diff(a, b):
    string = ''
    
    if(len(a) >= len(b)):
        longString = a
        shortString = b
    else:
        longString = b
        shortString = a
    for i in range(len(longString)):
        if(i >= len(shortString) or longString[i] != shortString[i]):
            string += longString[i]
    return string

a = 'abcdfjghij'
b = 'abcdfjghi'
print(diff(a,b))

If one string just has one character added and it could be anywhere in the string, you could change

string += longString[i]

to

string = longString[i]
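
One caveat with that change: when the extra character sits in the middle, every later position is shifted and may mismatch too, so the last mismatch is not necessarily the inserted character. A hedged sketch that stops at the first mismatch instead, assuming the longer string is the shorter one plus exactly one extra character (the function name `inserted_char` is hypothetical):

```python
def inserted_char(a, b):
    """Assumes the longer string is the shorter one plus exactly one extra character."""
    long_s, short_s = (a, b) if len(a) >= len(b) else (b, a)
    for i, ch in enumerate(short_s):
        # The first position that disagrees is where the extra character was inserted.
        if long_s[i] != ch:
            return long_s[i]
    # No mismatch within the shorter string: the extra character is the last one.
    return long_s[-1]

print(inserted_char('abcdfjghij', 'abcdfjghi'))  # j
print(inserted_char('abXcd', 'abcd'))            # X
```

If the assumption does not hold (for example, the strings differ in several places), the counting approaches are a safer fit.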
