
Best way to perform string validation with user input in Python 3.4?

I have a string like this:

line = 'City'  # City can have 2 or 3 parts

----> 2 parts: the first character of each part is a capital letter.

----> 3 parts: the first character of the 1st and 2nd parts is a capital letter.

The line I get is always valid because I already check it with a regex. What I would like to know is the best way to ask the user for a character and then check whether the input matches the first character of any part of the city name. If it does, the city name should be printed.

This is what I have so far, but I have only been learning Python for two days and I'm struggling.

line_ = "Mont de Marsan"

while True:
    inp = input('')
    if 'ABORT' in inp:
        inp = False
        sys.exit(0)
    else:
        inp = input('')
        for c in line_:
            if inp == c:
                print (line_)
            else:
                inp = False
                sys.exit(0)
                break

I hope the description of my problem is straightforward, because it's getting messy in my mind :)

Could you please help me find the best way to do this in real time and for a lot of strings?

 /* EDIT */

Expected behaviour of the program if City is 'Pont de Marsan':

<---- d
----> Mont de Marsan
<---- P
----> Pont de Marsan
<---- M
----> Mont de Marsan
<---- l
 program exits.

Here's some more explanation:

I have a list of cities. Some are a single word like 'Paris', others have several parts like 'Mont de Marsan' or 'Pont Neuf'. I have to ask the user for a single character. If they enter 'P', I have to print 'Paris' and 'Pont Neuf'; if they enter 'd', I have to print 'Mont de Marsan'. It's the same behaviour as the GPS system in cars.

"(...) if the input is the same as the City's first character (no matter what part of the city)"

You can split the city name on whitespace using the string method split() to get a list of its parts:

>>> "Mont de Marsan".split() == ["Mont", "de", "Marsan"]
True
>>>

split() defaults to splitting on runs of whitespace, but it can also split a string on any other string, e.g.

>>> "abc".split("b") == ['a', 'c']
True
>>> 

You can then go through each part of the city name and check what it starts with, using either the startswith() method or string indexing if you only want a specific number of letters, e.g. "Paris"[0] to match only the first letter "P".
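A minimal sketch of that check, combining split() with indexing. The helper name part_starts_with is made up for illustration and the comparison is case-sensitive:

```python
def part_starts_with(city, letter):
    """Return True if any whitespace-separated part of city begins with letter."""
    # split() with no argument never yields empty strings, so part[0] is safe
    return any(part[0] == letter for part in city.split())


print(part_starts_with("Mont de Marsan", "d"))  # True
print(part_starts_with("Mont de Marsan", "P"))  # False
```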

You didn't mention it, but I assume you also want case-insensitive matching so that both "p" and "P" will match "Paris", "Pont Neuf" and "pont neuf". To do this, you could simply convert your city names and the user input to the same case using lower() or upper(), but since you're using Python 3.x you might as well take advantage of the casefold() method, which is made for this purpose. From the docs:

Casefolding is similar to lowercasing but more aggressive because it is intended to remove all case distinctions in a string. For example, the German lowercase letter 'ß' is equivalent to "ss". Since it is already lowercase, lower() would do nothing to 'ß'; casefold() converts it to "ss".
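To see the difference described in that quote, here is a short illustration. The sample word "Straße" is my own choice, not something from the question:

```python
word = "Straße"

lowered = word.lower()      # 'ß' is already lowercase, so it is left alone
folded = word.casefold()    # casefold() expands 'ß' to 'ss'

print(lowered)  # straße
print(folded)   # strasse

# For plain ASCII input the two methods agree
print("PARIS".casefold() == "PARIS".lower())  # True
```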

In the snippet below, you're replacing the user input (which starts out as a string) with a boolean. There's nothing technically wrong with it, and Python won't complain, but is it really what you want?

inp = input('')      # a string
if 'ABORT' in inp:   # still a string
    inp = False      # now a boolean

Maybe you wanted to break the while loop using a boolean? In that case, you could do the following:

import sys

done = False

while not done:
    inp = input('')
    if 'ABORT' in inp:
        done = True
    ...

sys.exit(0)

The code above also eliminates the need to repeat sys.exit(0) throughout the code.

An edited version of your code. I might have changed it a bit too much but just use what you can:

import sys

cities = {"Mont de Marsan", "Pont Neuf", "Paris"}

done = False

while not done:
    inp = input('> ')
    if inp:  # a non-empty string is truthy
        if 'ABORT' in inp:
            done = True
        else:
            # Casefold both sides for case-insensitive matching
            inp = inp.casefold()
            for city in cities:
                city_casefold = city.casefold()
                if city_casefold.startswith(inp):
                    # The whole name starts with the input
                    print(city)
                else:
                    # Otherwise check each part of the name
                    for part in city_casefold.split():
                        if part.startswith(inp):
                            print(city)
                            break
    else:
        # Empty input also ends the loop
        done = True

sys.exit(0)

I saved the code in a script called suggestions.py. Testing it:

$ python3 suggestions.py
> p
Pont Neuf
Paris
> P
Pont Neuf
Paris
> PAR
Paris
> AR
> DE
Mont de Marsan
> de
Mont de Marsan
> d
Mont de Marsan
> D
Mont de Marsan
> 
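Because cities is a set, the order in which matches are printed is not guaranteed and can change between runs. As a rough sketch, the same matching rule can be pulled into a function that returns a sorted list, which also makes it easier to test. The function name suggest is hypothetical, not part of the script above:

```python
def suggest(cities, prefix):
    """Return the sorted cities whose name, or any part of it, starts with prefix."""
    prefix = prefix.strip().casefold()
    if not prefix:
        return []
    matches = []
    for city in cities:
        folded = city.casefold()
        # Same rule as the loop above: whole name first, then each part
        if folded.startswith(prefix) or any(
            part.startswith(prefix) for part in folded.split()
        ):
            matches.append(city)
    return sorted(matches)


cities = {"Mont de Marsan", "Pont Neuf", "Paris"}
print(suggest(cities, "p"))   # ['Paris', 'Pont Neuf']
print(suggest(cities, "DE"))  # ['Mont de Marsan']
print(suggest(cities, "ar"))  # []
```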

My strategy is to create a dictionary of sets of city names, with the initial letter of each word in a city name as the key. It doesn't take long to create this dictionary, but it makes finding the matching cities very fast.

My input loop ignores empty strings and leading or trailing blank spaces, and it prints "Nothing matches" if it can't find a match because I find it very annoying when a program closes just because I gave it bad input.

from collections import defaultdict

# Create list of city names
cities = '''\
Aix en Provence
Bordeaux
Clermont Ferrand
Le Mans
Le Havre
Limoges
Lyon
Marseille
Mont de Marsan
Montpellier
Nantes
Nice
Nîmes
Paris
Pont Neuf
Saint Denis
Saint Étienne
Strasbourg
Toulon
Toulouse
Tours
'''.splitlines()

# Build a dictionary of cities keyed by the
# 1st letter of each word of the city name
city_index = defaultdict(set)
for city in cities:
    for word in city.split():
        city_index[word[0]].add(city)

# Display the city index
for k in sorted(city_index.keys()):
    print(k, city_index[k])
print()

print('Select cities by an initial letter, or ABORT to quit')
while True:
    s = input('? ')

    # Remove leading or trailing whitespace
    s = s.strip()

    # Ignore empty input
    if not s:
        continue

    if s == 'ABORT':
        break

    # We only want a single letter, so discard anything after the first letter
    s = s[0]

    # Get matching cities
    matches = city_index.get(s)
    if matches:
        print(matches)
    else:
        print('Nothing matches')

Test run

A {'Aix en Provence'}
B {'Bordeaux'}
C {'Clermont Ferrand'}
D {'Saint Denis'}
F {'Clermont Ferrand'}
H {'Le Havre'}
L {'Le Mans', 'Limoges', 'Lyon', 'Le Havre'}
M {'Le Mans', 'Mont de Marsan', 'Marseille', 'Montpellier'}
N {'Pont Neuf', 'Nice', 'Nantes', 'Nîmes'}
P {'Pont Neuf', 'Aix en Provence', 'Paris'}
S {'Strasbourg', 'Saint Étienne', 'Saint Denis'}
T {'Toulon', 'Tours', 'Toulouse'}
d {'Mont de Marsan'}
e {'Aix en Provence'}
É {'Saint Étienne'}

Select cities by an initial letter, or ABORT to quit
? S
{'Strasbourg', 'Saint Étienne', 'Saint Denis'}
? T
{'Toulon', 'Tours', 'Toulouse'}
? A
{'Aix en Provence'}
? C
{'Clermont Ferrand'}
? K
Nothing matches
? d
{'Mont de Marsan'}
? F
{'Clermont Ferrand'}
? M
{'Le Mans', 'Mont de Marsan', 'Marseille', 'Montpellier'}
? E
Nothing matches
? É
{'Saint Étienne'}
?   Silly
{'Strasbourg', 'Saint Étienne', 'Saint Denis'}
? ABORT
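Note that the index above is case-sensitive: 'd' finds 'Mont de Marsan' but 'D' finds only 'Saint Denis', and 'E' finds nothing. If case-insensitive lookup is wanted, one possible variation is to casefold the keys and the input. This is only a sketch; the helper names build_index and lookup are made up for illustration, and accented letters are still treated as distinct from their unaccented forms:

```python
from collections import defaultdict


def build_index(cities):
    """Map the casefolded first letter of every word to the cities containing it."""
    index = defaultdict(set)
    for city in cities:
        for word in city.split():
            index[word[0].casefold()].add(city)
    return index


def lookup(index, text):
    """Return the sorted cities indexed under the first letter of text."""
    text = text.strip()
    if not text:
        return []
    # .get() avoids creating an empty entry in the defaultdict
    return sorted(index.get(text[0].casefold(), ()))


index = build_index(
    ["Aix en Provence", "Mont de Marsan", "Paris", "Pont Neuf", "Saint Étienne"]
)
print(lookup(index, "p"))  # ['Aix en Provence', 'Paris', 'Pont Neuf']
print(lookup(index, "E"))  # ['Aix en Provence']
print(lookup(index, "é"))  # ['Saint Étienne']
```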


 