
Using Python regex on tuples to filter user input

I want to use regex to filter user input based on a set of tuples. An error message should be returned if the user input isn't found in the set of tuples and is not an alphanumeric character. I don't know how to access the tuples in my Python regex code, so I passed in src.items(). How do I use the escape feature to get src.items() to bring in its values? Or perhaps I should not be doing it this way at all.

My code:

import re

direction = ('north', 'south', 'east', 'west', 'down', 'up', 'left', 'right', 'back')
verb = ('go', 'stop', 'kill', 'eat')
stop = ('the', 'in', 'of', 'from', 'at', 'it')
noun = ('door', 'bear', 'princess', 'cabinet')    

src = {'direction': direction,
       'verb': verb,
       'stop': stop,
       'noun': noun
       }

# use this to pick out error strings from user input
er = r"*[\W | src.items()]"
ep = re.compile(er, re.IGNORECASE)

First, there's a redundancy here:

An error message should be returned if the user input isn't found in the set of tuples and is not an alphanumeric character

If the user input is in your set of tuples, how can it contain a non-alphanumeric character? Also, you don't specify whether you're testing individual words or complete phrases.

Let's try a different approach. First, don't use two levels of data structure where one will do (i.e., just the dictionary). Second, we'll switch the tuples to lists, not for technical reasons but for semantic ones (homogeneous -> lists, heterogeneous -> tuples). And we'll toss the regex for now in favor of a simple split() and an in test. Finally, we'll test complete phrases:

vocabulary = {
    'direction': ['north', 'south', 'east', 'west', 'down', 'up', 'left', 'right', 'back'],
    'verb': ['go', 'stop', 'kill', 'eat'],
    'stop': ['the', 'in', 'of', 'from', 'at', 'it'],
    'noun': ['door', 'bear', 'princess', 'cabinet']
    }

vocabulary_list = [word for sublist in vocabulary.values() for word in sublist]

phrases = ["Go in the east door", "Stop at the cabinet", "Eat the bear", "Do my taxes"]

# use this to pick out error strings from user input
for phrase in phrases:
    if any(term.lower() not in vocabulary_list for term in phrase.split()):
        print(phrase, "-> invalid")
    else:
        print(phrase, "-> valid")

PRODUCES

Go in the east door -> valid
Stop at the cabinet -> valid
Eat the bear -> valid
Do my taxes -> invalid

From here, you might consider allowing some punctuation, like commas and periods, and simply stripping it rather than judging it.
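A minimal sketch of that punctuation handling, assuming Python 3 and that only leading and trailing punctuation on each word should be ignored. The is_valid helper and the sample phrases are made up for illustration; the vocabulary is the same as above:

```python
import string

vocabulary = {
    'direction': ['north', 'south', 'east', 'west', 'down', 'up', 'left', 'right', 'back'],
    'verb': ['go', 'stop', 'kill', 'eat'],
    'stop': ['the', 'in', 'of', 'from', 'at', 'it'],
    'noun': ['door', 'bear', 'princess', 'cabinet']
    }

# A set gives fast membership tests; the words are the same as in vocabulary_list.
vocabulary_set = {word for sublist in vocabulary.values() for word in sublist}

def is_valid(phrase):
    """Hypothetical helper: True if every word, minus surrounding punctuation, is known."""
    terms = [term.strip(string.punctuation).lower() for term in phrase.split()]
    # Drop tokens that consisted of punctuation only.
    terms = [term for term in terms if term]
    # An empty phrase is treated as invalid (an assumption, not stated in the question).
    return bool(terms) and all(term in vocabulary_set for term in terms)

for phrase in ["Go east, in the door.", "Stop at the cabinet!", "Do my taxes."]:
    print(phrase, "-> valid" if is_valid(phrase) else "-> invalid")
```

Punctuation inside a word (for example "do'or") is left alone here, so such input is still rejected.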

This is not a good place to use regexps, and what you have is nothing like a valid Python regexp.
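For comparison, if a regexp were wanted anyway, a valid one could be sketched roughly as follows. This assumes Python 3.4+ (for fullmatch) and the src dictionary from the question. The name is_known_word is made up for illustration. Each word is passed through re.escape, which is presumably the "escape feature" the question asks about, and the words are then joined with | to form an alternation:

```python
import re

direction = ('north', 'south', 'east', 'west', 'down', 'up', 'left', 'right', 'back')
verb = ('go', 'stop', 'kill', 'eat')
stop = ('the', 'in', 'of', 'from', 'at', 'it')
noun = ('door', 'bear', 'princess', 'cabinet')

src = {'direction': direction,
       'verb': verb,
       'stop': stop,
       'noun': noun
       }

# Flatten every tuple in the dictionary into one list of words.
words = [word for group in src.values() for word in group]

# re.escape neutralises any regex metacharacters a word might contain.
pattern = re.compile("|".join(re.escape(word) for word in words), re.IGNORECASE)

def is_known_word(text):
    """Hypothetical helper: True only if the whole string is one known word."""
    return pattern.fullmatch(text) is not None

print(is_known_word("North"))      # a known word, matched case-insensitively
print(is_known_word("northward"))  # contains a known word but is not one
```

Note that the pattern is built from the values with ordinary string operations. Writing src.items() inside the pattern string, as in the question, only matches those literal characters.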

You are better off just looping over the commands and checking whether the user input (maybe forced to lower case) is equal to any of them.
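A minimal sketch of that loop, assuming the commands are the verb tuple from the question and that surrounding whitespace should be ignored. The names commands and find_command are hypothetical:

```python
# Assumption: the "commands" are the verbs from the question.
commands = ('go', 'stop', 'kill', 'eat')

def find_command(user_input):
    """Hypothetical helper: return the matching command, or None if there is none."""
    cleaned = user_input.strip().lower()
    for command in commands:
        if cleaned == command:
            return command
    return None

for text in ["GO", "  eat ", "dance"]:
    result = find_command(text)
    if result is None:
        print(repr(text), "-> error: unknown command")
    else:
        print(repr(text), "->", result)
```

The loop could be shortened to a single `cleaned in commands` test. The explicit form is kept here to mirror the description.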

