
Split a string by comma, except inside brackets and except when a dash "-" comes directly before and/or after the comma?

I am trying to figure out how to split a string by comma, except when the comma is inside brackets AND except when a dash comes directly before and/or after the comma. I have already found some good solutions for the bracket problem, but I do not have any clue how to extend them to my problem.

Here is an example:

example_string = 'A-la-carte-Küche, Garnieren (Speisen, Getränke), Kosten-, Leistungsrechnung, Berufsausbildung, -fortbildung'
aim = ['A-la-carte-Küche', 'Garnieren (Speisen, Getränke)', 'Kosten-, Leistungsrechnung', 'Berufsausbildung, -fortbildung']

So far, I have managed to do the following:

>>> import re
>>> out = re.split(r',\s*(?![^()]*\))', example_string)
>>> out
['A-la-carte-Küche', 'Garnieren (Speisen, Getränke)', 'Kosten-', 'Leistungsrechnung', 'Berufsausbildung', '-fortbildung']

Note the difference between aim and out for the terms 'Kosten-, Leistungsrechnung' and 'Berufsausbildung, -fortbildung'. I would be glad if someone could help me out so that the output looks like aim.

Thanks in advance!
Alex

If you can make use of the Python regex module (the third-party package, not the built-in re), you could do:

\([^()]*\)(*SKIP)(*F)|(?<!-)\s*,\s*(?!,)

The pattern matches:

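  • \([^()]*\) Match from an opening parenthesis to the next closing parenthesis
  • (*SKIP)(*F) Discard what was just matched and skip past it, so commas inside the parentheses are never considered
  • | Or
  • (?<!-)\s*,\s*(?!,) Match a comma between optional whitespace chars to split on, provided it is not directly preceded by a dash and the match is not followed by another comma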

Regex demo

import regex

example_string = 'A-la-carte-Küche, Garnieren (Speisen, Getränke), Kosten-, Leistungsrechnung, Berufsausbildung, -fortbildung'
print(regex.split(r"\([^()]*\)(*SKIP)(*F)|(?<!-)\s*,\s*(?!,)", example_string))

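Output

['A-la-carte-Küche', 'Garnieren (Speisen, Getränke)', 'Kosten-, Leistungsrechnung', 'Berufsausbildung', '-fortbildung']

Note that 'Berufsausbildung' and '-fortbildung' are still split here, because the lookahead (?!,) only rules out a following comma, not a following dash.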
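If installing the third-party regex module is not an option, the same skip-and-fail idea can be approximated with the standard re module. The following is only a sketch under stated assumptions: the helper name split_outside_parens is hypothetical, parentheses are assumed not to nest, and the lookahead is changed to (?!\s*-) (instead of the (?!,) used above) so that a comma followed by a dash is kept as well.

```python
import re

# Branch 1 consumes a whole "(...)" group so its commas are never seen.
# Branch 2 (capture group 1) is a real separator: a comma that is not
# preceded by "-" and not followed by optional whitespace plus "-".
# Assumption of this sketch: parentheses are balanced and not nested.
TOKEN = re.compile(r"\([^()]*\)|((?<!-)\s*,\s*(?!\s*-))")


def split_outside_parens(text):
    """Hypothetical helper: split on separator commas only."""
    parts = []
    start = 0
    for match in TOKEN.finditer(text):
        if match.group(1) is not None:  # separator branch matched
            parts.append(text[start:match.start()])
            start = match.end()
        # otherwise a parenthesized group matched: keep it, do not split
    parts.append(text[start:])
    return parts


example_string = 'A-la-carte-Küche, Garnieren (Speisen, Getränke), Kosten-, Leistungsrechnung, Berufsausbildung, -fortbildung'
aim = ['A-la-carte-Küche', 'Garnieren (Speisen, Getränke)', 'Kosten-, Leistungsrechnung', 'Berufsausbildung, -fortbildung']

print(split_outside_parens(example_string) == aim)
```

The loop plays the role of (*SKIP)(*F): a match of the first branch is simply ignored, and only matches of the captured comma branch produce a split.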

You can use

re.split(r'(?<!-),(?!\s*-)\s*(?![^()]*\))', example_string)

See the Python demo.

Details:

  • (?<!-) - a negative lookbehind that fails the match if there is a - char immediately to the left of the current location
  • , - a comma
  • (?!\s*-) - a negative lookahead that fails the match if there are zero or more whitespaces followed by a - char immediately to the right of the current location
  • \s* - zero or more whitespaces
  • (?![^()]*\)) - a negative lookahead that fails the match if there are zero or more chars other than ) and ( and then a ) char immediately to the right of the current location.

See the regex demo, too.
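To check this pattern against the example from the question, a small self-contained script might look like the following. The names SPLIT_RE and result are made up for this sketch; the pattern itself is the one shown above and only needs the standard re module.

```python
import re

example_string = 'A-la-carte-Küche, Garnieren (Speisen, Getränke), Kosten-, Leistungsrechnung, Berufsausbildung, -fortbildung'
aim = ['A-la-carte-Küche', 'Garnieren (Speisen, Getränke)', 'Kosten-, Leistungsrechnung', 'Berufsausbildung, -fortbildung']

# Comma not preceded by "-", not followed by optional whitespace and "-",
# and not followed by a ")" without an intervening "(" (i.e. not inside
# a simple, non-nested pair of parentheses).
SPLIT_RE = re.compile(r'(?<!-),(?!\s*-)\s*(?![^()]*\))')

result = SPLIT_RE.split(example_string)
print(result)
print(result == aim)
```

Keep in mind that the (?![^()]*\)) lookahead only looks for the next closing parenthesis, so it relies on parentheses being balanced and not nested.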
