Using Python regex on tuples to filter user inputs

I want to use regex to filter user input based on a set of tuples. An error message should be returned if the user input isn't found in the set of tuples and is not an alphanumeric character. I don't know how I can access the tuples in my Python regex code. So I passed in src.items(). How do I use the escape feature to get src.items() to bring in its values, or should I perhaps not be doing it this way?

My code:

import re

direction = ('north', 'south', 'east', 'west', 'down', 'up', 'left', 'right', 'back')
verb = ('go', 'stop', 'kill', 'eat')
stop = ('the', 'in', 'of', 'from', 'at', 'it')
noun = ('door', 'bear', 'princess', 'cabinet')    

src = {'direction': direction,
       'verb': verb,
       'stop': stop,
       'noun': noun}

# use this to pick out error strings from user input
    er = r"*[\W | src.items()]"
    ep = re.compile(er, re.IGNORECASE)

First, there's a redundancy here:

An error message should be returned if the user input isn't found in the set of tuples and is not an alphanumeric character

If the user input is in your set of tuples, how can it contain a non-alphanumeric character? Also, you don't specify whether you're testing individual words or complete phrases at a time.

Let's try a different approach. First, don't use two levels of data structure where one will do (i.e. just the dictionary). Second, we'll switch the tuples to lists, not for technical reasons but for semantic ones (homogeneous -> lists, heterogeneous -> tuples). And we'll toss the regex for now in favor of a simple split() and an in test. Finally, we'll test complete phrases:

vocabulary = {
    'direction': ['north', 'south', 'east', 'west', 'down', 'up', 'left', 'right', 'back'],
    'verb': ['go', 'stop', 'kill', 'eat'],
    'stop': ['the', 'in', 'of', 'from', 'at', 'it'],
    'noun': ['door', 'bear', 'princess', 'cabinet']
}

vocabulary_list = [word for sublist in vocabulary.values() for word in sublist]

phrases = ["Go in the east door", "Stop at the cabinet", "Eat the bear", "Do my taxes"]

# use this to pick out error strings from user input
for phrase in phrases:
    if any(term.lower() not in vocabulary_list for term in phrase.split()):
        print(phrase, "-> invalid")
    else:
        print(phrase, "-> valid")


Go in the east door -> valid
Stop at the cabinet -> valid
Eat the bear -> valid
Do my taxes -> invalid

From here, you might consider allowing some punctuation like commas and periods and simply stripping it rather than judging it.
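
One way that stripping could look, as a minimal sketch: it assumes only punctuation at the start or end of each word should be ignored, and the names known_words, normalize and is_valid are hypothetical helpers introduced here for illustration.

```python
import string

vocabulary = {
    'direction': ['north', 'south', 'east', 'west', 'down', 'up', 'left', 'right', 'back'],
    'verb': ['go', 'stop', 'kill', 'eat'],
    'stop': ['the', 'in', 'of', 'from', 'at', 'it'],
    'noun': ['door', 'bear', 'princess', 'cabinet'],
}

# Flatten the vocabulary into a set for fast membership tests.
known_words = {word for words in vocabulary.values() for word in words}

def normalize(term):
    """Lower-case a term and strip punctuation from both ends (hypothetical helper)."""
    return term.strip(string.punctuation).lower()

def is_valid(phrase):
    """Return True if every non-empty term in the phrase is a known word."""
    terms = [normalize(term) for term in phrase.split()]
    # Terms that were pure punctuation normalize to "" and are skipped.
    return all(term in known_words for term in terms if term)

for phrase in ["Go, in the east door.", "Stop at the cabinet!", "Do my taxes."]:
    print(phrase, "->", "valid" if is_valid(phrase) else "invalid")
```

Note that under this sketch an empty phrase counts as valid, since there is no unknown term in it; a real parser would probably want to reject that case separately.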

This is not a good place to use regexps, and that is nothing like a valid Python regexp.
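
For comparison only, here is a sketch of what a valid pattern might look like if a regex were wanted anyway. Text such as src.items() inside a pattern string is matched literally rather than evaluated, so the pattern has to be built from the words themselves, here with re.escape and "|". The src dictionary is taken from the question; word_pattern and is_known are made-up names.

```python
import re

src = {
    'direction': ('north', 'south', 'east', 'west', 'down', 'up', 'left', 'right', 'back'),
    'verb': ('go', 'stop', 'kill', 'eat'),
    'stop': ('the', 'in', 'of', 'from', 'at', 'it'),
    'noun': ('door', 'bear', 'princess', 'cabinet'),
}

# Escape each word so any regex metacharacters are matched literally,
# then join all words as alternatives: north|south|...|cabinet
alternatives = "|".join(re.escape(word) for words in src.values() for word in words)
word_pattern = re.compile(alternatives, re.IGNORECASE)

def is_known(word):
    """Return True if the whole string is one of the known words (hypothetical helper)."""
    # fullmatch requires the entire string to match one alternative.
    return word_pattern.fullmatch(word) is not None

print(is_known("North"))  # True
print(is_known("taxes"))  # False
```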

You are better off just checking, in a loop, whether the user input (maybe forced to lower case) is equal to any of the commands.
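
A minimal sketch of that check, assuming the user types one word at a time and reusing the src dictionary from the question; find_category is a hypothetical name.

```python
src = {
    'direction': ('north', 'south', 'east', 'west', 'down', 'up', 'left', 'right', 'back'),
    'verb': ('go', 'stop', 'kill', 'eat'),
    'stop': ('the', 'in', 'of', 'from', 'at', 'it'),
    'noun': ('door', 'bear', 'princess', 'cabinet'),
}

def find_category(user_input):
    """Return the category of a known word, or None if it is not recognized."""
    word = user_input.strip().lower()
    for category, words in src.items():
        if word in words:
            return category
    return None

for text in ["North", "kill", "dance"]:
    category = find_category(text)
    if category is None:
        print("error: unknown word", repr(text))
    else:
        print(text, "is a", category)
```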
