简体   繁体   中英

python given query string find a set of strings with same beginning

Edit: I appreciate all the answers but could anyone tell me why my solution is not working? I wanted to try to do this without the.startswith() thank you!

I am trying to complete this excercise:

Implement an autocomplete system. That is, given a query string and a set of all possible query strings, return all strings in the set that have s as a prefix. For example, given the query string de and the set of strings [dog, deer, deal], return [deer, deal]. Hint: Try preprocessing the dictionary into a more efficient data structure to speed up queries.

But I get a empty list. What could I be doing wrong? I thought this would give me [deer, deal]

def autocomplete(string,set):
    string_letters = []
    letter_counter = 0
    list_to_return = []

    for letter in string:
        string_letters.append(letter)

    for words in set:
        for letter in words:
            if letter_counter == len(string):
                list_to_return.append(words)
            if letter == string_letters[letter_counter]:
                letter_counter += 1
            else:
                break
    return list_to_return

print(autocomplete("de", ["dog","deer","deal"]))

output:

[]

Edit: I appreciate all the answers but could anyone tell me why my solution is not working? I wanted to try to do this without the.startswith() thank you!

Here is how I would accomplish what you are trying to do:

import re
strings = ['dog', 'deer', 'deal']
search = 'de'
pattern = re.compile('^' + search)
[x for x in strings if pattern.match(x)]

RESULT: ['deer', 'deal']

However in most cases with a use case such as this, you might want to ignore the case of the search string and search field.

import re
strings = ['dog', 'Deer', 'deal']
search = 'De'
pattern = re.compile('^' + search, re.IGNORECASE)
[x for x in strings if pattern.match(x)]

RESULT: ['Deer', 'deal']

To answer the part of why your code does not work, it helps to add some verbosity to the code:

def autocomplete(string,set):
    string_letters = []
    letter_counter = 0
    list_to_return = []

    for letter in string:
        string_letters.append(letter)

    for word in set:
        print(word)

        for letter in word:
            print(letter, letter_counter, len(string))
            if letter_counter == len(string):
                list_to_return.append(word)
            if letter == string_letters[letter_counter]:
                letter_counter += 1
            else:
                print('hit break')
                break
    return list_to_return

print(autocomplete("de", ["dog","deer","deal"]))

Output:

dog
('d', 0, 2)
('o', 1, 2)
hit break
deer
('d', 1, 2)
hit break
deal
('d', 1, 2)
hit break
[]

As you can see in the output for dog 'd matched but o did not', this made the letter_counter 1, then upon deer 'd.= 'e' so it breaks... This perpetuates over and over. Interestingly setting 'ddeer' would actually match due this behavior, To fix this you need to reset the letter_counter in the for loop. and have additional break points to prevent over-reving your indexes.

def autocomplete(string,set):
    string_letters = []
    list_to_return = []

    for letter in string:
        string_letters.append(letter)

    for word in set:
        # Reset letter_counter as it is only relevant to this word.
        letter_counter = 0
        print(word)

        for letter in word:
            print(letter, letter_counter, len(string))
            if letter == string_letters[letter_counter]:
                letter_counter += 1
            else:
                # We did not match break early
                break
            if letter_counter == len(string):
                # We matched for all letters append and break.
                list_to_return.append(word)
                break
    return list_to_return

print(autocomplete("de", ["dog","deer","deal"]))

I notice the hint , but it's not stated as a requirement, so:

def autocomplete(string,set):
    return [s for s in set if s.startswith(string)]

print(autocomplete("de", ["dog","deer","deal"]))

str.startswith(n) will return a boolean value, True if the str starts with n , otherwise, False .

You can just use the startswith string function and avoid all those counters, like this:

def autocomplete(string, set):
    list_to_return = []

    for word in set:
        if word.startswith(string):
            list_to_return.append(word)
    return list_to_return

print(autocomplete("de", ["dog","deer","deal"]))

Simplify.

def autocomplete(string, set):
    back = []
    for elem in set:
        if elem.startswith(string[0]):
            back.append(elem)
    return back

print(autocomplete("de", ["dog","deer","deal","not","this","one","dasd"]))

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM