I'm trying to check whether a string is comma-separated. After the check, I'm going to use the string to help load a SQL database, so the words in it can't be separated by anything other than a comma. I do have an approach that works, but it seems very clunky for Python. Is there a more concise or less expensive way to check a string for comma separation?
Here's my attempt run in a Python 2.7.4 interpreter:
# List of possible Strings
comma_check_list = ['hello, world', 'hello world', 'hello,  world',
                    'hello world, good, morning']
# Dictionary of punctuation that's not a comma
punct_dict = {'@': True, '^': True, '!': True, ' ': True, '#': True, '%': True,
              '$': True, '&': True, ')': True, '(': True, '+': True, '*': True,
              '-': True, '=': True}
# Function to check the string
def string_check(comma_check_list, punct_dict):
    for string in comma_check_list:
        new_list = string.split(", ")
        if char_check(new_list, punct_dict) == False:
            print string, False
        else:
            print string, True
# Function to check each character
def char_check(new_list, punct_dict):
    for item in new_list:
        for char in item:
            if char in punct_dict:
                return False
# Usage
string_check(comma_check_list, punct_dict)
# Output
hello, world True
hello world False
hello,  world False
hello world, good, morning False
Thank you in advance for your help!
for currentString in comma_check_list:
    if any(True for char in currentString if char in '@^! #%$&)(+*-="'):
        print currentString, False
    else:
        print currentString, True
The string '@^! #%$&)(+*-="' holds the characters you don't want in the string. So, if any character in currentString is in that list, we print False.
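The same check can be wrapped in a function that returns a boolean instead of printing, which is easier to reuse before loading the database. This is only a sketch: the names FORBIDDEN and is_clean are hypothetical, and the character list is copied from the snippet above. It avoids print statements so it should run on both Python 2.7 and Python 3.

```python
# Characters treated as invalid, copied from the snippet above.
FORBIDDEN = '@^! #%$&)(+*-="'

def is_clean(text, forbidden=FORBIDDEN):
    """Return True when no character of `text` appears in `forbidden`."""
    return not any(char in forbidden for char in text)

# Pair each sample string with its result.
results = [(s, is_clean(s)) for s in ['hello,world', 'hello world', 'hello, world']]
```

Note that, unlike the split(", ") version in the question, this check also rejects 'hello, world', because the space after the comma is itself one of the forbidden characters.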
I would probably reduce your code to the following.
# List of possible Strings
comma_check_list = ['hello, world', 'hello world', 'hello,  world', 'hello world, good, morning']
# Set of punctuation that's not a comma
punct = set('@^! #%$&)(+*-="')
# Function to check the string
def string_check(comma_check_list, punct):
    for string in comma_check_list:
        new_list = string.split(", ")
        print string, not any(char in punct for item in new_list for char in item)
# Usage
string_check(comma_check_list, punct)
Changes made: I used a set, since you are using the dictionary for lookups only; used any instead of the nested loops; and dropped the if condition by printing the result directly.
Output
In [6]: %run
hello, world True
hello world False
hello,  world False
hello world, good, morning False
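If the result is needed by later code rather than only printed, the same split-then-lookup logic can return a boolean for a single string. A minimal sketch, assuming the hypothetical names PUNCT and is_comma_separated; the frozenset simply makes the lookup table immutable. It uses no print statements, so it should run on both Python 2.7 and Python 3.

```python
# Punctuation (other than a comma) that must not appear inside an item.
PUNCT = frozenset('@^! #%$&)(+*-="')

def is_comma_separated(text, punct=PUNCT):
    """Split on comma-plus-space, then reject any leftover forbidden character."""
    return not any(char in punct
                   for item in text.split(", ")
                   for char in item)

samples = ['hello, world', 'hello world', 'hello,  world',
           'hello world, good, morning']
flags = [is_comma_separated(s) for s in samples]
```

Because the split removes exactly one space after each comma, a second space (as in 'hello,  world') is left inside the item and causes a rejection.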
You should whitelist for valid SQL identifiers instead:
import re
ID_RE = re.compile(r'^[a-zA-Z_][a-zA-Z_0-9$]+$')
def is_sql_columns(columns):
    return all(ID_RE.match(column_name.strip())
               for column_name in columns.split(','))
### Test cases ###
def main():
    test = [
        'hello,world',      # True
        ' hello , world ',  # True
        'hello world',      # False
        '!@#$%^&*,yuti',    # False
        'hello',            # True
        'hello,',           # False
        'a!b,c@d',          # False
        ''                  # False
    ]
    for t in test:
        print '{!r:>16}{!r:>8}'.format(t, is_sql_columns(t))
if __name__ == '__main__':
    main()
This is a conservative RE for valid identifiers in PostgreSQL; it doesn't handle non-ASCII letters or quoted identifiers. It also allows extra spaces around the column names, since those don't matter in SQL anyway.
Also remember that this will reject valid column lists for a SELECT that use column aliases (e.g. SELECT first_name AS fname, last_name lname…).
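The limitations mentioned above can be seen by running the whitelist against a few inputs. This sketch repeats the pattern and function from the answer so that it runs on its own; the sample strings are made up for illustration. It should run on both Python 2.7 and Python 3.

```python
import re

# Pattern and function repeated from the answer so the snippet is self-contained.
ID_RE = re.compile(r'^[a-zA-Z_][a-zA-Z_0-9$]+$')

def is_sql_columns(columns):
    return all(ID_RE.match(name.strip()) for name in columns.split(','))

plain = is_sql_columns('first_name, last_name')                  # plain identifiers
aliased = is_sql_columns('first_name AS fname, last_name lname')  # column aliases
quoted = is_sql_columns('"first name", last_name')                # quoted identifier
single = is_sql_columns('a, b')                                   # one-letter names
```

Only the first call returns True. The alias and quoted-identifier cases fail because of the embedded spaces and quotes. The one-letter names fail because the pattern requires a first character followed by at least one more; if single-character column names matter, replacing the trailing + with * would accept them.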