Split based on commas but ignore commas within double-quotes
I am trying to split a string on commas while ignoring any commas inside double quotes. I then need to add the resulting pieces to a list.
line = "DATA", "LT", "0.40", "1.25", "Sentence, which contain,
commas", "401", "", "MN", "", "", "", "", ""
When I try this:
lineItems = line.split(",")
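it splits on every comma.
When I split with a regular expression instead, I get everything back as a single list element (it does not split them at all).
Is there a way to get: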
newlist = ['DATA', 'LT', '0.40', '1.25', 'Sentence, which contain,
commas', '401', '', 'MN', '', '', '', '', '']
Thanks!
PS: I will have many similar lines, so I would like to get the same kind of result by iterating over them.
You can use the built-in shlex module, like this:
import shlex
line = '"DATA", "LT", "0.40", "1.25", "Sentence, which contain, commas", "401", "", "MN", "", "", "", "", ""'
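# Every token except the last ends with the separating comma, so append one
# to the input and then strip exactly one trailing character from each token.
newlist = [x[:-1] for x in shlex.split(line + ",")]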
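As an alternative to shlex (not something the answer above proposes), the standard-library csv module may also fit here, since the data looks like quoted CSV. The sketch below assumes every field is wrapped in double quotes and that separators are a comma followed by a space; `skipinitialspace=True` tells the reader to ignore that space so the opening quote is recognized. The variable names `text`, `reader`, and `rows` are illustrative only.

```python
import csv
import io

# One logical record whose quoted field contains both commas and a newline.
text = '"DATA", "LT", "0.40", "1.25", "Sentence, which contain,\ncommas", "401", "", "MN", "", "", "", "", ""\n'

# skipinitialspace=True drops the blank after each comma, so the quote that
# follows is treated as the start of a quoted field.
reader = csv.reader(io.StringIO(text, newline=""), skipinitialspace=True)

# Each row is a list of fields; a quoted newline stays inside its field.
rows = [row for row in reader]
print(rows[0])
```

For many lines stored in a file, the same reader can be given the open file object (opened with `newline=""`) and iterated row by row.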
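You mention that you are trying to split a "string" variable, so I assume you forgot to add the proper quotes. Assuming the double quotes are balanced, does the following help?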
import re  # the standard-library module is enough for this pattern
line = """ "DATA", "LT", "0.40", "1.25", "Sentence, which contain,
commas", "401", "", "MN", "", "", "", "", "" """
l = re.findall(r'"([^"]*)"', line)
print(l)
Prints:
['DATA', 'LT', '0.40', '1.25', 'Sentence, which contain, \ncommas', '401', '', 'MN', '', '', '', '', '']
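Since the question mentions many similar lines, the same pattern can be wrapped in a small helper and applied in a loop. This is only a sketch: `parse_quoted_fields` and `records` are hypothetical names, the quote-count check is a rough guard for the "balanced quotes" assumption, and unquoted fields or escaped quotes inside a field are not handled.

```python
import re

# Captures the text between each pair of double quotes.
QUOTED_FIELD = re.compile(r'"([^"]*)"')

def parse_quoted_fields(record):
    """Return the contents of every double-quoted field in record."""
    # An odd number of quotes means at least one field is not closed.
    if record.count('"') % 2:
        raise ValueError("unbalanced double quotes")
    return QUOTED_FIELD.findall(record)

# Hypothetical sample records standing in for the real input lines.
records = [
    '"DATA", "LT", "Sentence, which contain, commas", ""',
    '"A", "", "b, c"',
]

rows = [parse_quoted_fields(r) for r in records]
for row in rows:
    print(row)
```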