
Compare characters in strings in Python

I'm trying to compare the characters of two separate strings. The idea is to return a value corresponding to how many characters the two strings share. For example, if string 1 was 'mouse' and string 2 was 'house', they would share 4/5 characters. It's important to note that they only share a character if it is at the same index position.

def compareWords(word1, word2):
    # Compare two five-letter words position by position.
    result = 0
    if word1[0] == word2[0]:
        result += 1
    if word1[1] == word2[1]:
        result += 1
    if word1[2] == word2[2]:
        result += 1
    if word1[3] == word2[3]:
        result += 1
    if word1[4] == word2[4]:
        result += 1
    # Indices 0-4 cover a five-letter word; word1[5] would raise IndexError.
    print('{}/5'.format(result))

Working result

Non-working result

Use zip and sum:

a,b = "house", "mouse"

print(sum(s1 == s2 for s1, s2 in zip(a, b)))
4

Zipping pairs the characters at the same index; summing how many times s1 == s2 is true (True counts as 1) then gives the count of matching characters:

In [1]: a,b = "house", "mouse"

In [2]: list(zip(a, b))
Out[2]: [('h', 'm'), ('o', 'o'), ('u', 'u'), ('s', 's'), ('e', 'e')]
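
To report the count in the "4/5" form from the question, the same expression can be wrapped in a small function. This is only a minimal sketch: the name `match_ratio` is made up here, and it assumes both words have the same length.

```python
def match_ratio(word1, word2):
    """Return 'matches/length' for two equal-length words (hypothetical helper)."""
    # Count the positions where both words hold the same character.
    matches = sum(c1 == c2 for c1, c2 in zip(word1, word2))
    return "{}/{}".format(matches, len(word1))

print(match_ratio("mouse", "house"))  # 4/5
```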

The only thing that is unclear is what you want to use as the "out of" value if the strings are of different lengths.
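
Note that zip stops at the end of the shorter string, so any extra characters in the longer one are silently ignored. One possible convention, sketched below, is to treat them as mismatches by dividing by the longer length. `count_matches` is an illustrative name, and the choice of denominator is an assumption, not something the question specifies.

```python
def count_matches(s1, s2):
    """Return (matches, total), where total is the longer length (assumed convention)."""
    # zip pairs characters only up to the shorter string's length.
    matches = sum(c1 == c2 for c1, c2 in zip(s1, s2))
    # Unpaired trailing characters count against the score.
    return matches, max(len(s1), len(s2))

print(count_matches("house", "mouse"))   # (4, 5)
print(count_matches("house", "houses"))  # (5, 6)
```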

If you want both the matches and the sum, you can still use the same logic:

def paired(s1, s2):
    sm, index_ch = 0, []
    for ind, (c1, c2) in enumerate(zip(s1, s2)):
        if c1 == c2:
            sm += 1
            index_ch.append((ind, c1))
    return index_ch, sm

index_char, sm = paired("house", "mouse")

print(index_char, sm)

Output:

[(1, 'o'), (2, 'u'), (3, 's'), (4, 'e')] 4

If you want to preserve the position and character of the matches, you can enumerate the strings, then calculate the intersection of the sets of the resulting tuples. If you don't need to preserve any information about the nature of the matches, I think Padraic's answer is better.

Demo:

>>> s1 = 'hello world'
>>> s2 = 'xelxx worxx'
>>> same = set(enumerate(s1)).intersection(enumerate(s2))
>>> same
{(7, 'o'), (2, 'l'), (1, 'e'), (8, 'r'), (6, 'w'), (5, ' ')}
>>> len(same)
6
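
Because sets are unordered, the matches in the demo come back in arbitrary order. If the positions should be listed in order, the intersection can be sorted. A small sketch, with `common_positions` as a made-up name:

```python
def common_positions(s1, s2):
    """Return sorted (index, char) pairs where both strings agree (illustrative helper)."""
    # enumerate() yields (index, char) tuples; the intersection keeps only
    # tuples present in both, i.e. the same character at the same index.
    return sorted(set(enumerate(s1)).intersection(enumerate(s2)))

same = common_positions("hello world", "xelxx worxx")
print(same)       # [(1, 'e'), (2, 'l'), (5, ' '), (6, 'w'), (7, 'o'), (8, 'r')]
print(len(same))  # 6
```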
