Split a comma delimited string which may contain commas and escaped quotes between single quotes

How do you split a comma-delimited string? The string is a series of comma-delimited numbers and words. Words are wrapped in single quotes and numbers are not. Words may contain non-delimiting commas, which should be left alone because they sit inside the quotes. Words may also contain quote characters, which are escaped by a preceding backslash, like so:

'','some-mail@some-domain.org','f4c1bfd5-969d-\'4,7\"2a-,b1\'29-42de49eb4406',2827,1378614418

I have tried to use a regex [^\\'] to split but that also picks up escaped commas.

I have tried literally counting the characters as an alternative but that is deathly slow.

Python's csv reader also splits the string on the non-delimiting commas when the string contains backslash-escaped quotes. Perhaps it's not valid CSV?

The sub-string list I should have as a result is:

[ 
'', # empty string
'some-mail@some-domain.org', # text like email
'f4c1bfd5-969d-\'4,7\"2a-,b1\'29-42de49eb4406', # text, comma and escaped quotes
2827, # number
1378614418 # number
]

This is how I have used the csv module:

reader = csv.reader(StringIO(values_string), delimiter=',', quotechar="'", quoting=csv.QUOTE_ALL, skipinitialspace=True)

But I get:

['', 'some-mail@some-domain.org', 'f4c1bfd5-969d-\\4', '7\\"2a-', "b1\\'29-42de49eb4406'", '2827', '1378614418']
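
Answer: tell the csv module that quotes are escaped with a backslash rather than doubled, by passing doublequote=False and escapechar='\\':

import csv
with open(file_name, newline='') as fp:
    reader = csv.reader(fp, quotechar="'", doublequote=False, escapechar='\\')
    rows = list(reader)  # each row is a list of strings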
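
To check these reader settings against the sample string from the question, here is a minimal sketch that parses it from memory instead of a file. The to_number helper is hypothetical and not part of the csv module; it is included only because csv.reader returns every field as a string, while the expected list in the question contains integers.

```python
import csv
from io import StringIO

# Sample line from the question, written as a raw string so the
# backslashes reach the parser unchanged.
values_string = r"'','some-mail@some-domain.org','f4c1bfd5-969d-\'4,7\"2a-,b1\'29-42de49eb4406',2827,1378614418"

reader = csv.reader(
    StringIO(values_string),
    quotechar="'",       # words are wrapped in single quotes
    doublequote=False,   # quotes are not escaped by doubling them
    escapechar="\\",     # a backslash escapes the next character
)
row = next(reader)  # the sample is a single line, so read one row


def to_number(field):
    # Hypothetical helper: turn all-digit fields into int and leave
    # everything else as a string. Assumption: quoted words never
    # consist solely of digits, since the quoting is gone after parsing.
    return int(field) if field.isdigit() else field


values = [to_number(field) for field in row]
print(values)
```

The escaping backslashes are consumed by the parser, so the third field comes back containing the bare ' and " characters, which is the same string that the expected list in the question denotes as a Python literal.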
