Split string on comma when field contains a comma
Consider the following string:
'538.48,0.29,"533.59 - 540.00","AZO",102482,"+0.05%","N/A",0.00,535.09,"AutoZone, Inc. Co",538.77,"N/A"'
I need to split this into a list so it looks like the following:
[538.48, 0.29, "533.59 - 540.00", "AZO", 102482, "+0.05%", "N/A", 0.00, 535.09, "AutoZone, Inc. Co", 538.77, "N/A"]
The problem is that I can't just use str.split(','), because the 10th field has a comma within it. The question, then, is how best to split the original string into a list when arbitrary fields may contain a comma.
Use the csv module rather than attempting to split this yourself; it handles quoted values, including quoted values that contain the delimiter, out of the box:
>>> import csv
>>> from pprint import pprint
>>> data = '538.48,0.29,"533.59 - 540.00","AZO",102482,"+0.05%","N/A",0.00,535.09,"AutoZone, Inc. Co",538.77,"N/A"'
>>> reader = csv.reader(data.splitlines())
>>> pprint(next(reader))
['538.48',
 '0.29',
 '533.59 - 540.00',
 'AZO',
 '102482',
 '+0.05%',
 'N/A',
 '0.00',
 '535.09',
 'AutoZone, Inc. Co',
 '538.77',
 'N/A']
Note the 'AutoZone, Inc. Co' column value: the embedded comma is preserved and the surrounding quotes are stripped.
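To see why a plain split falls short on this input, here is a small sketch comparing the two approaches on the same string. The variable names are only illustrative.

```python
import csv

data = '538.48,0.29,"533.59 - 540.00","AZO",102482,"+0.05%","N/A",0.00,535.09,"AutoZone, Inc. Co",538.77,"N/A"'

# Naive split: every comma is a separator, so the quoted company name is cut in two
# and the quote characters stay attached to the pieces.
naive = data.split(',')

# csv.reader accepts any iterable of lines; a one-element list is enough here.
parsed = next(csv.reader([data]))

print(len(naive), naive[9], naive[10])
print(len(parsed), parsed[9])
```

The naive version yields one field too many, while the csv version keeps the company name as a single field.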
If you are reading this data from a file, pass the file object to csv.reader() directly rather than handing it a sequence of strings.
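As a rough sketch of the file-based variant, assuming Python 3: the example below first writes the sample line to a temporary file (purely so it is self-contained) and then reads it back by handing the open file to csv.reader(). The csv documentation recommends opening the file with newline='' so the module can handle line endings itself.

```python
import csv
import os
import tempfile

line = '538.48,0.29,"533.59 - 540.00","AZO",102482,"+0.05%","N/A",0.00,535.09,"AutoZone, Inc. Co",538.77,"N/A"'

# Create a throwaway file; in real use you would already have a CSV file on disk.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write(line + "\n")
    path = tmp.name

# Pass the file object itself to csv.reader(); it iterates over the file line by line.
with open(path, newline="") as f:
    rows = list(csv.reader(f))

os.remove(path)  # clean up the temporary file

print(rows[0][9])
```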
You can even have the numeric values (anything not quoted) interpreted as floating point values by setting quoting=csv.QUOTE_NONNUMERIC:
>>> reader = csv.reader(data.splitlines(), quoting=csv.QUOTE_NONNUMERIC)
>>> pprint(next(reader))
[538.48,
 0.29,
 '533.59 - 540.00',
 'AZO',
 102482.0,
 '+0.05%',
 'N/A',
 0.0,
 535.09,
 'AutoZone, Inc. Co',
 538.77,
 'N/A']
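Note that QUOTE_NONNUMERIC turns every unquoted field into a float, so 102482 comes back as 102482.0 rather than the integer shown in the question, and an unquoted field that is not a valid number raises ValueError. If integers matter, one possible post-processing step is sketched below. The parse_row helper is hypothetical, not part of the csv module, and it cannot tell 0.00 apart from 0, so both end up as the integer 0.

```python
import csv

data = '538.48,0.29,"533.59 - 540.00","AZO",102482,"+0.05%","N/A",0.00,535.09,"AutoZone, Inc. Co",538.77,"N/A"'


def parse_row(line):
    """Hypothetical helper: parse one CSV line and turn whole-number floats into ints."""
    row = next(csv.reader([line], quoting=csv.QUOTE_NONNUMERIC))
    # Quoted fields stay strings; unquoted fields arrive as floats.
    return [int(v) if isinstance(v, float) and v.is_integer() else v for v in row]


fields = parse_row(data)
print(fields)
```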