
Comparing similarity of two strings in Python

I want to have the following:

Input: ("a#c","abc")
Output: True

Input: ("a#c","abd")
Desired Output: False
Real Output: True

So the function should return True if the two strings have the same length and differ only where the first string has the character #, which stands for any single character. Otherwise, I want it to return False.

What should I change in this function?

def checkSolution(problem, solution):

    if len(problem) != len(solution): 
        return False

    if len(problem) == len(solution):
        for i, c in zip(problem, solution):
            if i != c:
                return False

            if i == c or "#" == c:
                return True

print (checkSolution("a#c","abc"))

print (checkSolution("a#c","abd"))

You are checking only the first character. You should not return True as soon as the first character matches or is #. Instead, keep going until you find the first mismatch, and return True only after the for loop has finished.

The second problem is that in your test cases the variable c is never '#', since i is a character of problem, while c is a character of solution.

def checkSolution(problem, solution):
    if len(problem) != len(solution): 
        return False
    for i, c in zip(problem, solution):
        if i != '#' and c != '#' and i != c :
            return False
    return True
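As a quick sanity check, here is a sketch that repeats the function above and runs it on a few inputs. Note that because it tests both i and c against '#', this version treats '#' as a wildcard in either string, which is slightly more permissive than the question asks for. The third and fourth calls are extra cases that are not in the question.

```python
def checkSolution(problem, solution):
    if len(problem) != len(solution):
        return False
    for i, c in zip(problem, solution):
        # Only a pair of two different non-wildcard characters is a mismatch
        if i != '#' and c != '#' and i != c:
            return False
    return True

print(checkSolution("a#c", "abc"))   # True
print(checkSolution("a#c", "abd"))   # False
print(checkSolution("abc", "a#c"))   # True: '#' in solution also acts as a wildcard
print(checkSolution("a#c", "abcd"))  # False: lengths differ
```

If '#' should only count as a wildcard in problem, drop the c != '#' test.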

Right now you're only ever testing that the lengths and first characters match.

for i, c in zip(problem, solution):
    if i != c:
        # that's the first set of chars, but we're already returning??
        return False

    if i == c or "#" == c:
        # wildcard works here, but already would have failed earlier,
        # and still an early return if it IS true!
        return True

Instead you need to go through the whole string and return the result, or use all to do it for you.

def checkSolution(problem, solution):
    if len(problem) == len(solution):
        for p_ch, s_ch in zip(problem, solution):
            if p_ch == "#":  # wildcard
                continue  # so we skip testing this character
            if p_ch != s_ch:
                return False  # This logic works now that we're skipping testing the "#"
        else:  # if we fall off the bottom here
            return True  # then it must be equal
    else:
        return False

or in one line:

def checkSolution(problem, solution):
    return len(problem) == len(solution) and \
           all(p_ch == s_ch or p_ch == "#" for p_ch, s_ch in zip(problem, solution))

or if you're really crazy (read: you like regular expressions way too much), you could do something like:

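import re

def checkSolution(problem, solution):
    # Escape the literal pieces and join them with ".", which matches any one character
    pattern = ".".join(map(re.escape, problem.split("#")))
    return re.fullmatch(pattern, solution, flags=re.DOTALL) is not None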
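To make the regular expression approach less opaque, here is a small sketch of the pattern it builds. The helper name wildcard_pattern is made up for illustration and is not part of the answer above. It uses re.fullmatch so that no explicit "^" and "$" anchors are needed.

```python
import re

def wildcard_pattern(problem):
    # Hypothetical helper: split on "#", escape the literal pieces,
    # and rejoin them with ".", the regex for "any one character".
    return ".".join(map(re.escape, problem.split("#")))

print(wildcard_pattern("a#c"))  # a.c

print(re.fullmatch(wildcard_pattern("a#c"), "abc") is not None)  # True
print(re.fullmatch(wildcard_pattern("a#c"), "abd") is not None)  # False
# Escaping keeps a literal "." in problem from acting as a wildcard
print(re.fullmatch(wildcard_pattern("a.#"), "axb") is not None)  # False
```

Since "." matches exactly one character, the length check comes for free with this approach.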

As pointed out in the comments, your indentation is goofed up, and should be fixed.

if len(problem) == len(solution):
    # in the line below, 
    # 'i' will contain the next char from problem
    # 'c' will contain the next char from solution
    for i, c in zip(problem, solution):
        # in this line, if they're not equal, you return False
        # before you have a chance to look for the wildcard character
        if i != c:
            return False
        # ...and here you'd fail anyway, because you're testing 
        # the character from the solution string against the wildcard...
        if i == c or "#" == c:
            return True
# ...while in your test, you pass the wildcard in as part of the problem string.
print (checkSolution("a#c","abc"))

One-line version of your function:

def check_solution(problem, solution):
    return (len(problem) == len(solution) and
            all(ch==solution[i] for i, ch in enumerate(problem) if ch != '#'))

Test:

>>> check_solution("a#c", "abc")
True
>>> check_solution("a#c", "abd")
False
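All of the versions above keep the length comparison, and it is worth seeing why. zip stops at the end of the shorter string, so without that check any extra trailing characters would be silently ignored. A minimal sketch, where same_without_length_check is a hypothetical variant written only to show the problem:

```python
def same_without_length_check(problem, solution):
    # Hypothetical variant that omits the length comparison
    return all(p == s or p == "#" for p, s in zip(problem, solution))

# zip pairs up only the first three characters and drops the trailing "d"
print(list(zip("a#c", "abcd")))  # [('a', 'a'), ('#', 'b'), ('c', 'c')]
print(same_without_length_check("a#c", "abcd"))  # True, which is not what we want
```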

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 