This is a bit like this question: How to split comma-separated key-value pairs with quoted commas
But my input is different:
line='name=zhg,code=#123,"text=hello,boy"'
Note that the whole pair is quoted, "text=hello,boy", NOT just the value, text="hello,boy".
I'd like to split the line into a dict. The output I want is:
"name":"zhg","code":"#123","text":"hello,boy"
How can I get it using regex or shlex?
A regex is not the best tool for this, or at least not the most efficient one. Parsing such a string with a single-pass parser is straightforward:
line = 'name=zhg,code=#123,"text=hello,boy"'

def read_quote(string):
    # `string` starts just after the opening quote
    out = ''
    for index, char in enumerate(string):
        if char == '"':
            # consumed: the text, the closing quote and the comma after it, if any
            return index + 2, out
        out += char
    raise ValueError('unterminated quote')

def read(string):
    out = ''
    for index, char in enumerate(string):
        if char == ',':
            return index + 1, out  # skip comma
        out += char
    # end of string: everything was consumed
    return len(string), out

def components(string):
    index = 0
    while index < len(string):
        if string[index] == '"':
            inc, out = read_quote(string[index + 1:])
            index += inc + 1  # + 1 for the opening quote
        else:
            inc, out = read(string[index:])
            index += inc
        yield out

# split on the first '=' only, so values may contain '='
print(dict(e.split('=', 1) for e in components(line)))
It prints the following (on Python versions before 3.7 the key order may differ):
{'name': 'zhg', 'code': '#123', 'text': 'hello,boy'}
You can implement read and read_quote using a regex if you really want to.
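For comparison, here is a minimal sketch of what a regex-based variant might look like. The pattern and the name components_re are my own illustration, not part of the answer above. It assumes that quotes only ever wrap a whole field and never appear inside one, and it silently skips empty fields.

```python
import re

# Either a double-quoted field (captured without its quotes)
# or a run of characters containing no comma and no quote.
FIELD = re.compile(r'"([^"]*)"|([^,"]+)')

def components_re(string):
    # hypothetical helper: yields one field per match, quoted or bare
    for match in FIELD.finditer(string):
        quoted, bare = match.groups()
        yield quoted if quoted is not None else bare

line = 'name=zhg,code=#123,"text=hello,boy"'
# split on the first '=' only, so values may contain '='
result = dict(item.split('=', 1) for item in components_re(line))
print(result)
```

Whether this is easier to maintain than the hand-written loop is debatable; it mostly trades explicit index bookkeeping for a pattern that has to be read carefully.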
You can use csv.reader with a suitably "file-like" string (shown here for Python 3, where StringIO lives in the io module).
>>> import csv
>>> import io
>>> line = 'name=zhg,code=#123,"text=hello,boy"'
>>> string_file = io.StringIO(line)
>>> for row in csv.reader(string_file):
...     print(row)
...
['name=zhg', 'code=#123', 'text=hello,boy']
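The row above is still a list of key=value strings, so one more step is needed to get the dict the question asks for. A small sketch, assuming Python 3 and a single-line input; parse_pairs is a name I made up for illustration:

```python
import csv
import io

def parse_pairs(line):
    # csv.reader handles the quoting; take the first (and only) row
    row = next(csv.reader(io.StringIO(line)))
    # split each field on the first '=' only, so values may contain '='
    return dict(field.split('=', 1) for field in row)

line = 'name=zhg,code=#123,"text=hello,boy"'
print(parse_pairs(line))
```

A field without any '=' would make dict() raise a ValueError here, so real input may need validation first.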