
Parse a string by the last occurrence of a character before another character in Python

I have the following string, which needs to be split into smaller strings in the right way:

s = "A=3, B=value one, value two, value three, C=NA, D=Other institution, except insurance, id=DRT_12345"

I can't do the following, because I only need to split on the last "," before each "=":

s.split(",")
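To make the problem concrete, here is a small sketch of what the plain split produces on the string above (the name `naive` is only for illustration):

```python
s = "A=3, B=value one, value two, value three, C=NA, D=Other institution, except insurance, id=DRT_12345"

# Splitting on every comma also cuts through values that contain commas.
naive = s.split(",")

print(len(naive))   # 8 pieces, although there are only 5 key=value pairs
print(naive[1:4])   # [' B=value one', ' value two', ' value three']
```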

The result I expect is:

out = ["A=3",
       "B=value one, value two, value three",
       "C=NA",
       "D=Other institution, except insurance",
       "id=DRT_12345"]

Given the structure of the string, you can use re.findall:

import re

re.findall(r'\S+=.*?(?=, \S+=|$)', s)

['A=3',
 'B=value one, value two, value three',
 'C=NA',
 'D=Other institution, except insurance',
 'id=DRT_12345']

The pattern uses a lookahead to decide where to stop matching the current key-value pair.

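\S+      # match one or more non-whitespace characters
=        # ...followed by an equals sign
.*?      # match as little as possible, up to...
(?=      # a lookahead for
   ,     # a comma, followed by
   \s    # a whitespace character (the pattern above uses a literal space), followed by
   \S+   # one or more non-whitespace characters (the next key)
   =     # and an equals sign
   |     # OR
   $     # the end of the string
)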
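The breakdown above can also be kept inside the code itself with re.VERBOSE. This is only a sketch: the name `PAIR` is made up for illustration, and the literal space is written as `[ ]` because verbose mode ignores bare whitespace in the pattern.

```python
import re

s = "A=3, B=value one, value two, value three, C=NA, D=Other institution, except insurance, id=DRT_12345"

# Same pattern as the one-liner, spelled out with inline comments.
PAIR = re.compile(r"""
    \S+         # key: one or more non-whitespace characters
    =           # the equals sign
    .*?         # value: as few characters as possible, up to...
    (?=         # ...a position that is followed by
        ,[ ]    #   a comma and a space
        \S+=    #   and the next key with its equals sign
      |         # or
        $       #   the end of the string
    )
""", re.VERBOSE)

pairs = PAIR.findall(s)
print(pairs)
```

On the sample string this should give the same five items as the one-line version.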

"The last comma before an equals sign" can be written as a regular expression to split on:

import re
out = re.split(r',(?=[^,]*=)', s)

That is a comma ( , ) followed by (as a positive lookahead, (?= .. ) ) any number of non-comma characters ( [^,]* ) and then an equals sign ( = ).
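One thing to watch for: this split removes only the comma, so every piece after the first keeps the space that followed it. A minimal sketch of two ways to tidy that up, plus an optional step that turns the pairs into a dict. The `\s*` variant and the names `raw`, `parts`, `parts2` and `record` are illustrative additions, not part of the answer above.

```python
import re

s = "A=3, B=value one, value two, value three, C=NA, D=Other institution, except insurance, id=DRT_12345"

# The split as written: the separator is the comma only.
raw = re.split(r',(?=[^,]*=)', s)
print(repr(raw[1]))  # ' B=value one, value two, value three'

# Option 1: strip each piece afterwards.
parts = [p.strip() for p in raw]

# Option 2: let the separator also consume the whitespace after the comma.
parts2 = re.split(r',\s*(?=[^,]*=)', s)

# Optional: split each pair on its first "=" to build a dict.
record = dict(p.split("=", 1) for p in parts)
print(record["B"])   # value one, value two, value three
```

Both options assume that values never contain an "=" themselves; a value such as "x, y=z" would be split in the wrong place.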

