
How do I use regular expressions to search for words in queries?

Hi there, I am currently working on a query solver and am using regular expressions to search for words in user-entered queries.

However, I have run into a problem: the code I am using does not do what I originally intended. The code is as follows:

def query():
    print ('Enter a query\n\nThe query must not have more than 30 characters.\n')
    while True:
        query = raw_input ('Query:  ')
        if 30> len(query):
            break
            print ('The query must have less than 30 chracters.\n')

def querysolver():
    query_words = dict.fromkeys(['screen_repair','Phone_virus','Water_damage', False])
    if re.search (r'[wet]', query):
                  query_words['Water_damage'] = True
    if re.search (r'[water]', query):
                  query_words['Water_damage'] = True
    if re.search (r'[wet]', query):
                  query_words['Water_damage'] = True
    if re.search (r'[screen]', query):
                  query_words['screen_repair'] = True
    if re.search (r'[smashed]', query):
                  query_words['screen_repair'] = True
    if re.search (r'[hacked]', query):
                  query_words['Phone_virus'] = True
    if re.search (r'[virus]', query):
                  query_words['Phone_virus'] = True

How would I then use these values to find a solution to the user's query?

Regex is not the right tool for this, and you are using it incorrectly: [wet] is a character class, so it will match a single 'w', 'e', or 't'.
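
To see the difference concretely, here is a small sketch in Python 3 syntax (the sample query string is made up for illustration). A character class such as [wet] matches any single one of its letters, while the plain pattern wet only matches the three letters in sequence.

```python
import re

sample = "my screen is smashed"  # hypothetical query, not from the original post

# [wet] is a character class: it matches one character that is 'w', 'e' or 't'.
class_match = re.search(r'[wet]', sample)
print(class_match.group())   # the first 'e' in "screen"

# The literal pattern only matches the full sequence w-e-t, which is absent here.
literal_match = re.search(r'wet', sample)
print(literal_match)         # None
```

So the original check for water damage would fire on a query that is only about a screen.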

What you are doing in this code sample can be expressed much more easily as follows:

if 'wet' in query or 'water' in query:
    query_words['Water_damage'] = True
if 'screen' in query or 'smashed' in query:
    query_words['screen_repair'] = True
if 'hacked' in query or 'virus' in query:
    query_words['Phone_virus'] = True

Of course, in does not check for word boundaries, so this would also match shacked, but that should not be an issue with the keywords you are using, since the logic is rudimentary anyway.
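
If whole-word matching without regex ever becomes necessary, one possible sketch (Python 3 syntax; the helper name mentions_any is hypothetical, and the keywords are assumed to be lower-case) is to split the query into words and test set membership instead of substrings:

```python
def mentions_any(query, keywords):
    """Return True if any keyword appears as a whole word in the query."""
    # Lower-case and strip common punctuation so "Hacked!" still counts.
    words = {word.strip('.,!?;:"\'').lower() for word in query.split()}
    return not words.isdisjoint(keywords)

print('hacked' in 'my shed is shacked')                   # True: substring match
print(mentions_any('my shed is shacked', {'hacked'}))     # False: no whole word
print(mentions_any('I think I was hacked!', {'hacked'}))  # True
```

This only handles simple punctuation; anything more elaborate is probably easier with a word-boundary regex.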

It is not clear to me what you expect. First, look at the documentation for regular expressions: [wet] is a character class, so it matches if any one of the three letters w, e or t occurs in the query. If you try it, you will see that entering "wet" makes nearly all of your searches succeed (w is in [water], e is in [screen], [hacked] and [smashed], and so on). If you want to look for the whole word, your regex must be "wet". If you only want "wet" and not "anywetthing", you could use "\\bwet\\b", because "\\b" matches at word boundaries.
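
A short sketch of the word-boundary behaviour, in Python 3 syntax with made-up sample strings. The raw string r'\bwet\b' is the same pattern as "\\bwet\\b", just easier to read:

```python
import re

pattern = re.compile(r'\bwet\b')  # raw string, so \b reaches the regex engine intact

print(bool(pattern.search('my phone got wet')))  # True
print(bool(pattern.search('anywetthing')))       # False: "wet" sits inside a longer word
print(bool(pattern.search('wet, then dried')))   # True: punctuation counts as a boundary
```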

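But there are some more problems: how do you intend to pass your input to the calculation (querysolver)? Doing this through a variable named "query" does not work as written, because the string assigned to query inside query() is local to that function, so querysolver never sees it.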
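
The scoping issue can be illustrated with a simplified sketch (Python 3 syntax; the function names ask, solve_without_argument and solve are hypothetical). In the original code the situation is slightly different, because the global name query refers to the function itself, but the remedy is the same: pass the string as an argument.

```python
def ask():
    text = 'my phone got wet'   # local to ask(); invisible elsewhere
    return text

def solve_without_argument():
    return 'wet' in text        # 'text' is not defined in this scope

def solve(text):
    return 'wet' in text        # the string arrives as a parameter

try:
    solve_without_argument()
except NameError as error:
    print('NameError:', error)

print(solve(ask()))
```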

You could do something like:

import re

def query():
    print ('Enter a query\n\nThe query must not have more than 30 characters.\n')
    while True:
        query = raw_input ('Query:  ')
        print len(query), query
        if 30 < len(query):
            # a query that is too long ends the loop
            print ('The query must not have more than 30 characters.\n')
            break
        else:
            print querysolver(query)


def querysolver(query):
    # every category starts out as False
    query_words = dict.fromkeys(['screen_repair','Phone_virus','Water_damage'], False)
    if re.search (r'wet', query):
        query_words['Water_damage'] = True
    if re.search (r'water', query):
        query_words['Water_damage'] = True
    if re.search (r'screen', query):
        query_words['screen_repair'] = True
    if re.search (r'smashed', query):
        query_words['screen_repair'] = True
    if re.search (r'hacked', query):
        query_words['Phone_virus'] = True
    if re.search (r'virus', query):
        query_words['Phone_virus'] = True
    return query_words

query()

However, you should not use the name "query" for both the function and the input string. There are also smarter ways to write your if construct.

One example (not perfect, but easier to extend to more patterns):

def querysolver2(query):
    query_words = dict.fromkeys(['screen_repair','Phone_virus','Water_damage'], False)
    for pattern in ['wet','water']:
        pattern = r'\b'+pattern+r'\b'
        if re.search(pattern,query):
            query_words['Water_damage'] = True
    for pattern in ['screen','smashed']:
        pattern = r'\b'+pattern+r'\b'
        if re.search(pattern,query):
            query_words['screen_repair'] = True
    for pattern in ['hacked','virus']:
        pattern = r'\b'+pattern+r'\b'
        if re.search(pattern,query):
            query_words['Phone_virus'] = True
    return query_words
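
Going one step further, the keyword lists can live in a single table, and the resulting flags can then drive the reply, which is what the question ultimately asks for. The following is only a sketch in Python 3 syntax: the names KEYWORDS, ADVICE, querysolver3 and suggest, and the advice texts themselves, are placeholders invented for illustration and are not part of the original answers.

```python
import re

KEYWORDS = {
    'Water_damage': ['wet', 'water'],
    'screen_repair': ['screen', 'smashed'],
    'Phone_virus': ['hacked', 'virus'],
}

# Placeholder replies; real advice would come from the application.
ADVICE = {
    'Water_damage': 'Placeholder advice for water damage.',
    'screen_repair': 'Placeholder advice for a broken screen.',
    'Phone_virus': 'Placeholder advice for a virus.',
}

def querysolver3(query):
    """Return a dict mapping each category to True or False."""
    result = {}
    for category, words in KEYWORDS.items():
        # Build one pattern per category, e.g. \b(?:wet|water)\b
        alternatives = '|'.join(re.escape(word) for word in words)
        pattern = r'\b(?:' + alternatives + r')\b'
        result[category] = bool(re.search(pattern, query, re.IGNORECASE))
    return result

def suggest(query):
    """Return the advice lines for every category the query touches."""
    flags = querysolver3(query)
    return [ADVICE[category] for category, hit in flags.items() if hit]

print(querysolver3('My screen is smashed'))
print(suggest('Water got in and now I have a virus'))
```

Adding a new category then only means adding one entry to each table, with no new if statements.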


 