
Split based on commas but ignore commas within double-quotes

I am trying to split a string on commas while leaving commas inside double quotes alone. I then need to add the resulting pieces to a list.

line = "DATA", "LT", "0.40", "1.25", "Sentence, which contain, 
commas", "401", "", "MN", "", "", "", "", ""

When I try this:

lineItems = line.split(",")

it splits on every comma, including the ones inside the quotes.

When I split with a regular expression instead, I get all the elements back as a single list item (it does not split them at all).

Is there any way to get:

newlist  = ['DATA', 'LT', '0.40', '1.25', 'Sentence, which contain, 
    commas', '401', '', 'MN', '', '', '', '', '']

Thanks!

PS: I will have many similar lines, so I would like to get this result for each of them by iterating.

You can use the built-in shlex module, like this:

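import shlex
line = '"DATA", "LT", "0.40", "1.25", "Sentence, which contain, commas", "401", "", "MN", "", "", "", "", ""'

# shlex.split() breaks on whitespace and honours the quotes, but leaves the
# separating comma attached to each token; x[:-1] drops that last character.
# The final token here is an empty string, so slicing it is harmless.
newlist = [x[:-1] for x in shlex.split(line)]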
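
One caveat with the `x[:-1]` slice above: the last field has no comma after it, so if that field is not empty the slice cuts off a real character. A minimal sketch of a variant that leaves the last token alone is below. It assumes every separator is a comma followed by whitespace, and the shortened sample line is made up for illustration.

```python
import shlex

# Hypothetical sample: the last field is non-empty and itself ends in a comma.
line = '"DATA", "LT", "Sentence, which contain, commas", "401", "", "MN,"'

tokens = shlex.split(line)
# Each token except the last still carries its separator comma:
# ['DATA,', 'LT,', 'Sentence, which contain, commas,', '401,', ',', 'MN,']

# Strip the separator from every token but the last one.
fields = [t[:-1] for t in tokens[:-1]] + tokens[-1:]

print(fields)
# ['DATA', 'LT', 'Sentence, which contain, commas', '401', '', 'MN,']
```

If the fields may be separated by a bare comma with no space, shlex will glue neighbouring fields together, so this approach only suits input shaped like the sample.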

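You mention that you are trying to split a "string" variable, so I assume you forgot to add the proper quotes around it. Assuming the double quotes are balanced, does the following help?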

import re

line = """ "DATA", "LT", "0.40", "1.25", "Sentence, which contain, 
commas", "401", "", "MN", "", "", "", "", "" """

l = re.findall(r'"([^"]*)"', line)

print(l)

Prints:

['DATA', 'LT', '0.40', '1.25', 'Sentence, which contain, \ncommas', '401', '', 'MN', '', '', '', '', '']
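
Since the question mentions many similar lines, another option worth trying is the standard-library csv module, which already understands quoted fields. The following is only a sketch: the two-record sample text is invented, and it assumes the data really is comma-separated with double-quoted fields and an optional space after each comma.

```python
import csv
import io

# Hypothetical input: two records; the first has a quoted field that
# contains commas and a line break, as in the question.
text = (
    '"DATA", "LT", "0.40", "1.25", "Sentence, which contain, \n'
    'commas", "401", "", "MN", "", "", "", "", ""\n'
    '"DATA", "GT", "0.10", "2.50", "No commas here", "402", "", "MN"\n'
)

# skipinitialspace=True ignores the blank after each delimiter, so the
# opening quote is recognised at the start of the field.
reader = csv.reader(io.StringIO(text), skipinitialspace=True)

# Each iteration yields one record as a list of strings.
rows = [row for row in reader]

for row in rows:
    print(row)
```

With a real file, passing the object returned by `open(path, newline="")` in place of the `io.StringIO` wrapper should behave the same way, one list per record.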
