
How to find a word in a wordlist while knowing only certain characters

So I have a wordlist containing 3 words:

Apple
Christmas Tree
Shopping Bag

And I know only certain characters in the word, plus the length of the word, for instance:

???i???as ?r??

where each ? stands for an unknown character. I want to type this pattern into the console and get ALL the words in the word list that have these characters in these positions and the same number of characters.

Is there any way I can achieve this? I want my program to work the same way https://onelook.com/ does.

You can turn your expression into a regex and try matching with that:

import re

words = [
    'Apple',
    'Christmas Tree',
    'Shopping Bag'
]

match = '???i???as ?r??'
regex = '^' + match.replace('?', '.') + '$'  # turn your expression into a proper regex

for word in words:    # go through each word
    if re.match(regex, word):   # does the word match the regex?
        print(word)

Output:

Christmas Tree
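
One caveat with the snippet above: any character in the pattern that is special in a regex (for example "." or "(") would be interpreted rather than matched literally. Below is a minimal sketch of a slightly more defensive variant. The helper names pattern_to_regex and find_matches are made up for illustration, and the case-insensitive matching is an assumption about what you want, not something the question asks for.

```python
import re

def pattern_to_regex(pattern, wildcard="?"):
    # Each wildcard becomes '.', every other character is escaped
    # so that it is matched literally.
    parts = ["." if ch == wildcard else re.escape(ch) for ch in pattern]
    # IGNORECASE is an assumption; drop it for case-sensitive matching.
    return re.compile("".join(parts), re.IGNORECASE)

def find_matches(pattern, words):
    regex = pattern_to_regex(pattern)
    # fullmatch only succeeds if the whole word matches,
    # so the word must have the same length as the pattern.
    return [word for word in words if regex.fullmatch(word)]

words = ["Apple", "Christmas Tree", "Shopping Bag"]
print(find_matches("???i???as ?r??", words))  # ['Christmas Tree']
```

Using re.fullmatch removes the need for the explicit '^' and '$' anchors.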

If you have a small word list, then you can run a regex scan on the whole word list. Convert the query string into a regex and you're good to go.

Otherwise, you can organize the word list in chunks, e.g. the equivalent of splitting it into several files in different directories:

Appl => [ Apple ]
Chri => [ Christchurch, Christmas Tree, ... ]
Shop => [ Shopper, Shopping Bag, Shopkeeper ]

(You can have several levels.)

Since your search query seems to be anchored, i.e. you know it starts at the beginning of the word, when you look for "???i???as ?r??" you can see that "???i" only matches "Chri", so you only need to look in that sub-list.
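
The chunking idea above could be sketched roughly like this, using an in-memory dict instead of files. The 4-character key length is taken from the "Appl"/"Chri"/"Shop" example; the function names (build_buckets, compatible, matches, search) are hypothetical, and matching here is case-sensitive.

```python
from collections import defaultdict

PREFIX_LEN = 4  # assumed key length, as in the "Appl" / "Chri" / "Shop" example

def build_buckets(words, prefix_len=PREFIX_LEN):
    # Group the words by their first prefix_len characters.
    buckets = defaultdict(list)
    for word in words:
        buckets[word[:prefix_len]].append(word)
    return buckets

def compatible(pattern, text, wildcard="?"):
    # Compare position by position over the shorter of the two strings.
    return all(p == wildcard or p == t for p, t in zip(pattern, text))

def matches(pattern, word, wildcard="?"):
    # A full match also requires equal length.
    return len(pattern) == len(word) and compatible(pattern, word, wildcard)

def search(pattern, buckets, prefix_len=PREFIX_LEN):
    prefix = pattern[:prefix_len]
    results = []
    for key, bucket in buckets.items():
        # Skip every bucket whose key cannot match the start of the pattern.
        if compatible(prefix, key):
            results.extend(w for w in bucket if matches(pattern, w))
    return results

buckets = build_buckets([
    "Apple", "Christchurch", "Christmas Tree",
    "Shopper", "Shopping Bag", "Shopkeeper",
])
print(search("???i???as ?r??", buckets))  # ['Christmas Tree']
```

Only the "Chri" bucket is scanned for this query. If the pattern starts with nothing but wildcards, every bucket is still scanned, so the gain depends on how many known characters fall inside the prefix.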

(Actually, if you have to do this many times, you'd better build a recursive search and organize the list as an n-ary (or "m-ary") tree.)
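
As a rough sketch of the tree idea, here is one possible way to do it with a character trie and a recursive lookup. This is just an illustration under my own assumptions (the TrieNode class and the insert/search helpers are made-up names, and matching is case-sensitive), not the only way to lay out such a tree.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.is_word = False  # True if a word ends at this node

def insert(root, word):
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.is_word = True

def search(node, pattern, wildcard="?", prefix=""):
    # Pattern exhausted: it is a hit only if a word ends exactly here.
    if not pattern:
        return [prefix] if node.is_word else []
    head, rest = pattern[0], pattern[1:]
    if head == wildcard:
        # Unknown character: try every child branch.
        candidates = list(node.children.items())
    else:
        # Known character: follow that single branch, if it exists.
        child = node.children.get(head)
        candidates = [(head, child)] if child is not None else []
    results = []
    for ch, child in candidates:
        results.extend(search(child, rest, wildcard, prefix + ch))
    return results

root = TrieNode()
for w in ["Apple", "Christchurch", "Christmas Tree",
          "Shopper", "Shopping Bag", "Shopkeeper"]:
    insert(root, w)

print(search(root, "???i???as ?r??"))  # ['Christmas Tree']
```

Every known character in the pattern prunes all other branches at that depth, so the search only walks the parts of the tree that can still match.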

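Try checking whether the string is in the list. Note that this only finds exact matches, not patterns with unknown characters.

if word in word_list:   # exact match only
    print(word)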
