
Less clunky way of checking a string for comma separation in Python?

I'm trying to check whether a string is comma-separated. After the check, I'm going to use the string to help load a SQL database, so the words in it can't be separated by anything other than a comma. I have an approach that works, but it seems very clunky for Python. Is there a more concise or less expensive way to check a string for comma separation?

Here's my attempt run in a Python 2.7.4 interpreter:

# List of possible Strings
comma_check_list = ['hello, world', 'hello world', 'hello,  world',
                    'hello world, good, morning']

# Dictionary of punctuation that's not a comma
punct_dict = {'@': True, '^': True, '!': True, ' ': True, '#': True, '%': True,
              '$': True, '&': True, ')': True, '(': True, '+': True, '*': True,
              '-': True, '=': True}

# Function to check the string
def string_check(comma_check_list, punct_dict):
    for string in comma_check_list:
        new_list = string.split(", ")
        if char_check(new_list, punct_dict) == False:
            print string, False
        else:
            print string, True

# Function to check each character
def char_check(new_list, punct_dict):
    for item in new_list:
        for char in item:
            if char in punct_dict:
                return False
    # No forbidden character found in any item
    return True

# Usage
string_check(comma_check_list, punct_dict)

# Output
hello, world True
hello world False
hello,  world False
hello world, good, morning False

Thank you in advance for your help!

forbidden = '@^! #%$&)(+*-="'
for currentString in comma_check_list:
    # Split on ", " first so the space after a comma is not counted as a separator
    if any(char in forbidden for item in currentString.split(", ") for char in item):
        print currentString, False
    else:
        print currentString, True

The characters in '@^! #%$&)(+*-="' are the ones you don't want in the string. If any character in currentString (once it has been split on ", ") is in that set, we print False.

I would probably reduce your code to the following.

# List of possible Strings
comma_check_list = ['hello, world', 'hello world', 'hello,  world', 'hello world, good, morning']

# Set of punctuation that's not a comma
punct = set('@^! #%$&)(+*-="')

# Function to check the string
def string_check(comma_check_list, punct):
    for string in comma_check_list:
        new_list = string.split(", ")
        print string, not any(char in punct for item in new_list for char in item)

# Usage
string_check(comma_check_list, punct)

Changes made:

  1. Used a set, since you are using the dictionary for lookups only.
  2. Used any.
  3. Printed the boolean result directly instead of branching with an if condition.

Output

In [6]: %run 
hello, world True
hello world False
hello,  world False
hello world, good, morning False
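
If the result needs to feed later code rather than just be printed, a helper that returns a boolean may be more convenient. The following is a minimal sketch, not part of the answer above: the names FORBIDDEN and is_comma_separated are made up for illustration, and it swaps the explicit any loop for set.isdisjoint. It is written to run under both Python 2.7 and Python 3.

```python
# Hypothetical names for illustration; same forbidden characters as the set above.
FORBIDDEN = frozenset('@^! #%$&)(+*-="')

def is_comma_separated(text, forbidden=FORBIDDEN):
    """Return True if no item between ", " separators contains a forbidden character."""
    # isdisjoint accepts any iterable, so each item (a string) is compared
    # character by character against the forbidden set.
    return all(forbidden.isdisjoint(item) for item in text.split(", "))

results = [is_comma_separated(s) for s in
           ['hello, world', 'hello world', 'hello,  world',
            'hello world, good, morning']]
# results == [True, False, False, False]
```

Returning the value keeps the check separate from the printing, so the same function can gate whatever loads the database.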

You should whitelist for valid SQL identifiers instead:

import re

# First character: letter or underscore; then zero or more letters, digits, _ or $
ID_RE = re.compile(r'^[a-zA-Z_][a-zA-Z_0-9$]*$')

def is_sql_columns(columns):
    return all(ID_RE.match(column_name.strip()) 
               for column_name in columns.split(','))

### Test cases ###

def main():
    test = [
        'hello,world',     # True
        ' hello , world ', # True
        'hello world',     # False
        '!@#$%^&*,yuti',   # False
        'hello',           # True
        'hello,',          # False
        'a!b,c@d',         # False
        ''                 # False
    ]

    for t in test:
        print '{!r:>16}{!r:>8}'.format(t, is_sql_columns(t))

if __name__ == '__main__':
    main()

This is a conservative RE for valid identifiers in PostgreSQL; it doesn't handle non-ASCII letters or quoted identifiers. It also allows extra spaces around the names, since those don't matter in SQL anyway.

Also remember that this will reject valid column lists for a SELECT that use column aliases (e.g. SELECT first_name AS fname, last_name lname…).
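
One way to put the whitelist to work is to have the validator hand back the cleaned names, so the list that was checked is the same one used to build the statement. This is a rough sketch under assumptions: parse_sql_columns is a hypothetical name, the pattern is a conservative ASCII-only identifier check in the spirit of the one above (not a full PostgreSQL grammar), and raising ValueError is just one possible way to signal bad input.

```python
import re

# Conservative ASCII-only identifier pattern (assumption: no quoted identifiers).
ID_RE = re.compile(r'^[a-zA-Z_][a-zA-Z_0-9$]*$')

def parse_sql_columns(columns):
    """Split a comma-separated column list and return the stripped names.

    Raises ValueError if any name is not a plain identifier.
    """
    names = [name.strip() for name in columns.split(',')]
    for name in names:
        if not ID_RE.match(name):
            raise ValueError('invalid column name: %r' % name)
    return names
```

With this sketch, parse_sql_columns(' hello , world ') returns ['hello', 'world'], while 'hello world' or 'hello,' raises ValueError.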
