
Trying to find a match in two strings - Python

I have a user input two strings, and I want to check whether they share any characters. If they do, I want the position where the first shared character occurs, without using the find or index functions.

Below is what I have so far, but it doesn't fully work. I'm able to find the shared characters, but I'm not sure how to find their positions without using the index function.

string_a = "python"

string_b = "honbe"

same = []

a_len = len(string_a)
b_len = len(string_b)

for a in string_a:
    for b in string_b:

        if a == b:
            same.append(b)          

print (same)

Right now the output is:

['h', 'o', 'n']

So basically what I am asking is: how can I find the positions of those characters without using Python's index function?

This is a perfect use case for difflib.SequenceMatcher, which can find the longest block of characters the two strings have in common:

import difflib

string_a = 'python'
string_b = 'honbe'

matcher = difflib.SequenceMatcher(a=string_a, b=string_b)
match = matcher.find_longest_match(0, len(matcher.a), 0, len(matcher.b))

The match object has the attributes a, b, and size, where a is the starting index in the string matcher.a, b is the starting index in matcher.b, and size is the length of the match.

For example:

>>> match
Match(a=3, b=0, size=3)
>>> matcher.a[match.a:match.a+match.size]
'hon'
>>> match.a
3
>>> match.b
0
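
One caveat worth spelling out: find_longest_match returns the longest common block, which is not necessarily the first shared character. Below is a minimal sketch that wraps the call in a helper; the name longest_common_block and the choice to return None when nothing matches are assumptions made for illustration, not part of difflib.

```python
import difflib


def longest_common_block(string_a, string_b):
    """Return (index_in_a, index_in_b, text) for the longest common block, or None."""
    # autojunk=False turns off the "popular element" heuristic for long inputs.
    matcher = difflib.SequenceMatcher(a=string_a, b=string_b, autojunk=False)
    match = matcher.find_longest_match(0, len(string_a), 0, len(string_b))
    if match.size == 0:  # the strings share no characters at all
        return None
    return match.a, match.b, string_a[match.a:match.a + match.size]


print(longest_common_block("python", "honbe"))  # (3, 0, 'hon')
print(longest_common_block("xhon", "honx"))     # (1, 0, 'hon')
print(longest_common_block("abc", "xyz"))       # None
```

In the second call the first shared character of "xhon" is the 'x' at index 0, yet the longest block starts at index 1, so this approach fits best when the longest run is what you are after.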

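You can solve this problem using a list comprehension over the zipped strings. (In Python 2 you would use itertools.izip to avoid building a list; it does not exist in Python 3, where the built-in zip is already lazy.)

string_a = 'hello_world'
string_b = 'hi_low_old'

# x is the tuple of characters at position i; keep i when they are all equal.
same = [i for i, x in enumerate(zip(string_a, string_b)) if all(y == x[0] for y in x)]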

In [38]: same
Out[38]: [0, 3, 4, 7]

Here we compare the two strings position by position and return every index at which the characters are equal. Note that a character shared at different positions, like the 'h' in "python" and "honbe", is not reported. The output can easily be changed to include the matched characters, and the method scales easily to comparing more than two words.
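
As a rough sketch of how the same idea extends to several words, the hypothetical helper below (common_positions is an illustrative name, not from the answer) zips any number of strings and keeps the positions where all of them agree. It uses a set to test for equality, which is a substitution for the all(...) check and behaves the same here.

```python
def common_positions(*words):
    """Return the indexes at which every word has the same character."""
    # zip(*words) yields one tuple of characters per position and stops
    # at the end of the shortest word.
    return [i for i, chars in enumerate(zip(*words)) if len(set(chars)) == 1]


print(common_positions("hello_world", "hi_low_old"))            # [0, 3, 4, 7]
print(common_positions("hello_world", "hi_low_old", "hollow"))  # [0, 3, 4]
```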

You should iterate over the indices:

same = []  # collects (index in string_a, index in string_b, character)
for i in range(len(string_a)):
    for j in range(len(string_b)):
        if string_a[i] == string_b[j]:
            same.append((i, j, string_b[j]))

This will create a list of tuples that looks like:

[ (3, 0, "h"), ... ]
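
The nested loops compare every pair of characters. If only the first match is needed, one possible alternative is to record where each character first appears in string_b and then scan string_a once. This is a sketch under that assumption; the helper name first_shared_char is made up for illustration, and it still avoids find and index.

```python
def first_shared_char(string_a, string_b):
    """Return (index_in_a, index_in_b, char) for the first shared character, or None."""
    # Map each character of string_b to the index of its first occurrence.
    first_pos_in_b = {}
    for j, ch in enumerate(string_b):
        first_pos_in_b.setdefault(ch, j)

    # The first character of string_a that also occurs in string_b wins.
    for i, ch in enumerate(string_a):
        if ch in first_pos_in_b:
            return i, first_pos_in_b[ch], ch
    return None


print(first_shared_char("python", "honbe"))  # (3, 0, 'h')
print(first_shared_char("abc", "xyz"))       # None
```

The result is the same tuple the nested loops would append first, since both take the earliest position in string_a and then the earliest matching position in string_b.
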

To get only the first match, return as soon as one is found:

def find_similarity(string_a, string_b):
    for ia, ca in enumerate(string_a):
        for ib, cb in enumerate(string_b):
            if ca == cb:
                return ia, ib, ca

If you want all matches instead of just the first, you can replace the return statement with a yield statement and iterate over the results, or simply:

matches = list(find_similarity(string_a, string_b))

In the latter case, you get:

list(find_similarity(string_a, string_b))
=> [(3, 0, 'h'), (4, 1, 'o'), (5, 2, 'n')]
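
For reference, the return-to-yield change described above might look like the sketch below. The name iter_similarities is hypothetical and only chosen to keep it apart from the original find_similarity.

```python
def iter_similarities(string_a, string_b):
    """Yield (index_in_a, index_in_b, char) for every pair of equal characters."""
    for ia, ca in enumerate(string_a):
        for ib, cb in enumerate(string_b):
            if ca == cb:
                yield ia, ib, ca


# All matches, in order of their position in string_a:
print(list(iter_similarities("python", "honbe")))
# [(3, 0, 'h'), (4, 1, 'o'), (5, 2, 'n')]

# Only the first match; the None default covers strings with nothing in common:
print(next(iter_similarities("python", "honbe"), None))  # (3, 0, 'h')
```

Because a generator is lazy, taking the first result with next stops the loops as soon as a match is found, so one function serves both the "first match" and the "all matches" cases.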

If you just need to find the indexes where letters overlap, in Python 3.x you can do it like this:

str_a = "Python is a great language"
str_b = "languages express meaning"

result = [i for i, (a, b) in enumerate(zip(str_a, str_b)) if a == b]

Output:

[8, 9, 13, 14, 17, 24]
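
If the overlapping characters are wanted along with their indexes, a small variation could look like this. It is a sketch; the function name overlapping_chars is an assumption for illustration.

```python
def overlapping_chars(str_a, str_b):
    """Return (index, char) for each position where both strings hold the same character."""
    # zip stops at the end of the shorter string, so any trailing
    # characters of the longer one are never compared.
    return [(i, a) for i, (a, b) in enumerate(zip(str_a, str_b)) if a == b]


str_a = "Python is a great language"
str_b = "languages express meaning"

print(overlapping_chars(str_a, str_b))
# [(8, 's'), (9, ' '), (13, 'r'), (14, 'e'), (17, ' '), (24, 'g')]
```

Note that spaces count as matches too, and that str_a is one character longer than str_b here, so its final 'e' is skipped.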
