Python: given a query string, find the set of strings with the same beginning

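Edit: I appreciate all the answers, but can anyone tell me why my solution doesn't work? I'd like to try doing this without .startswith(). Thanks!

I am trying to complete this exercise:

Implement an autocomplete system. That is, given a query string s and a set of all possible query strings, return all strings in the set that have s as a prefix. For example, given the query string de and the set of strings [dog, deer, deal], return [deer, deal]. Hint: Try preprocessing the dictionary into a more efficient data structure to speed up queries.

But I get an empty list. What could I be doing wrong? I thought this would give me [deer, deal].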

def autocomplete(string,set):
    string_letters = []
    letter_counter = 0
    list_to_return = []

    for letter in string:
        string_letters.append(letter)

    for words in set:
        for letter in words:
            if letter_counter == len(string):
                list_to_return.append(words)
            if letter == string_letters[letter_counter]:
                letter_counter += 1
            else:
                break
    return list_to_return

print(autocomplete("de", ["dog","deer","deal"]))

output:

[]

Here is how I would accomplish what you are trying to do:

import re

strings = ['dog', 'deer', 'deal']
search = 'de'
# re.escape keeps any regex metacharacters in the search text literal.
pattern = re.compile('^' + re.escape(search))
print([x for x in strings if pattern.match(x)])

Result: ['deer', 'deal']

However, for a use case like this you will usually want to ignore case in both the search string and the strings being searched.

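import re

strings = ['dog', 'Deer', 'deal']
search = 'De'
# re.IGNORECASE makes the match case-insensitive; re.escape keeps the search text literal.
pattern = re.compile('^' + re.escape(search), re.IGNORECASE)
print([x for x in strings if pattern.match(x)])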

Result: ['Deer', 'deal']

To answer the part about why your code does not work, it helps to add some verbose output to the code:
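
As a side note, the same case-insensitive prefix check can be sketched without a regex by normalising both sides with str.casefold(). The function name autocomplete_ignore_case is only an illustrative choice, not something from the question:

```python
def autocomplete_ignore_case(prefix, words):
    """Return the words that start with prefix, ignoring case."""
    # casefold() is a more aggressive lower() intended for caseless comparison.
    folded_prefix = prefix.casefold()
    return [word for word in words if word.casefold().startswith(folded_prefix)]


print(autocomplete_ignore_case("De", ["dog", "Deer", "deal"]))
```

The original spelling of each word is preserved in the result; only the comparison is done on the case-folded text.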

def autocomplete(string,set):
    string_letters = []
    letter_counter = 0
    list_to_return = []

    for letter in string:
        string_letters.append(letter)

    for word in set:
        print(word)

        for letter in word:
            print(letter, letter_counter, len(string))
            if letter_counter == len(string):
                list_to_return.append(word)
            if letter == string_letters[letter_counter]:
                letter_counter += 1
            else:
                print('hit break')
                break
    return list_to_return

print(autocomplete("de", ["dog","deer","deal"]))

Output:

dog
('d', 0, 2)
('o', 1, 2)
hit break
deer
('d', 1, 2)
hit break
deal
('d', 1, 2)
hit break
[]

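As you can see in the output, for 'dog' the 'd' matches but the 'o' does not, which leaves letter_counter at 1. Then on 'deer' the first comparison is 'd' against 'e', so it breaks immediately, and the same thing happens for every word after that. Because the counter carries over from one word to the next, every later word is compared against the wrong position in the query. To fix this, you need to reset letter_counter inside the for loop. You also need an extra break once the whole query has matched, so that you do not index past the end of string_letters.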

def autocomplete(string,set):
    string_letters = []
    list_to_return = []

    for letter in string:
        string_letters.append(letter)

    for word in set:
        # Reset letter_counter as it is only relevant to this word.
        letter_counter = 0
        print(word)

        for letter in word:
            print(letter, letter_counter, len(string))
            if letter == string_letters[letter_counter]:
                letter_counter += 1
            else:
                # We did not match break early
                break
            if letter_counter == len(string):
                # We matched for all letters append and break.
                list_to_return.append(word)
                break
    return list_to_return

print(autocomplete("de", ["dog","deer","deal"]))
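
For comparison, here is a minimal sketch of the same idea without .startswith(), without the debug prints, and without a manual counter. It assumes that comparing a slice of each word against the query is acceptable for the exercise; the parameter names prefix and words are my own:

```python
def autocomplete(prefix, words):
    """Return the words that begin with prefix, without using str.startswith()."""
    n = len(prefix)
    # word[:n] is the first n characters of word (or the whole word if it is
    # shorter), so a word shorter than the prefix can never compare equal.
    return [word for word in words if word[:n] == prefix]


print(autocomplete("de", ["dog", "deer", "deal"]))
```

Slicing never raises an IndexError, so no extra bounds check is needed, and an empty prefix matches every word.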

I noticed the hint, but it is not stated as a requirement, so:

def autocomplete(string,set):
    return [s for s in set if s.startswith(string)]

print(autocomplete("de", ["dog","deer","deal"]))
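
If you do want to follow the hint, one common choice of "more efficient data structure" is a trie (prefix tree). The sketch below is only one possible way to do it, using nested dicts; build_trie, autocomplete_trie and the _END marker are names I made up for illustration:

```python
_END = None  # dict key marking "a complete word ends at this node"


def build_trie(words):
    """Preprocess the words into a trie made of nested dicts."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[_END] = word  # store the full word at its final node
    return root


def autocomplete_trie(prefix, trie):
    """Return every stored word that starts with prefix (in no particular order)."""
    node = trie
    for ch in prefix:
        node = node.get(ch)
        if node is None:
            return []  # no stored word continues with this character
    # Collect every word in the subtree below the prefix node.
    results = []
    stack = [node]
    while stack:
        current = stack.pop()
        for key, child in current.items():
            if key is _END:
                results.append(child)
            else:
                stack.append(child)
    return results


trie = build_trie(["dog", "deer", "deal"])
print(sorted(autocomplete_trie("de", trie)))
```

The trie is built once, and each query then only walks the characters of the prefix plus the matching subtree instead of scanning every word.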

str.startswith(n) returns a boolean: True if str starts with n, otherwise False.

You can just use the string method startswith and avoid all those counters, like this:

def autocomplete(string, set):
    list_to_return = []

    for word in set:
        if word.startswith(string):
            list_to_return.append(word)
    return list_to_return

print(autocomplete("de", ["dog","deer","deal"]))
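
If the same word list is queried many times, another option in the spirit of the hint is to sort it once and use binary search to jump to the first candidate. This is a rough sketch that assumes plain string ordering is acceptable; build_index and autocomplete_sorted are hypothetical names:

```python
from bisect import bisect_left


def build_index(words):
    """Preprocess once: a sorted list keeps words with a shared prefix adjacent."""
    return sorted(words)


def autocomplete_sorted(prefix, sorted_words):
    """Return the words starting with prefix from an already sorted list."""
    # Every word that has the prefix sorts at or after the prefix itself.
    start = bisect_left(sorted_words, prefix)
    results = []
    for word in sorted_words[start:]:
        if not word.startswith(prefix):
            break  # matches are contiguous, so the first miss ends the run
        results.append(word)
    return results


index = build_index(["dog", "deer", "deal"])
print(autocomplete_sorted("de", index))
```

Finding the starting position takes a logarithmic number of comparisons, after which only the matching words are visited.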

Simplified.

def autocomplete(string, set):
    back = []
    for elem in set:
        # Compare against the whole query, not just its first character.
        if elem.startswith(string):
            back.append(elem)
    return back

print(autocomplete("de", ["dog","deer","deal","not","this","one","dasd"]))
