
Regex: Split on commas, but exclude commas within parentheses and quotes (both single and double)

I have a string:

5,(5,5),C'A,B','A,B',',B','A,',"A,B",C"A,B" 

I want to split it on commas, but commas inside parentheses and inside quotes (single or double) must not act as separators.

Like this:

5 (5,5) C'A,B' 'A,B' ',B' 'A,' "A,B" C"A,B"

How can I do this with a Java regular expression?

You can use this regex:

String input = "5,(5,5),C'A,B','A,B',',B','A,',\"A,B\",C\"A,B\"";
String[] toks = input.split( 
                ",(?=(([^']*'){2})*[^']*$)(?=(([^\"]*\"){2})*[^\"]*$)(?![^()]*\\))" );
for (String tok: toks)
    System.out.printf("<%s>%n", tok);

Output:

<5>
<(5,5)>
<C'A,B'>
<'A,B'>
<',B'>
<'A,'>
<"A,B">
<C"A,B">

Explanation:

,                          # Match a literal comma
(?=(([^']*'){2})*[^']*$)   # Lookahead: the comma is followed by an even number of '
(?=(([^"]*"){2})*[^"]*$)   # Lookahead: the comma is followed by an even number of "
(?![^()]*\))               # Negative lookahead: the comma is not followed by a )
                           # with no ( or ) characters in between
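
The explanation above can also be kept next to the pattern itself. The following is a minimal sketch, assuming Java's Pattern.COMMENTS flag is acceptable in your code base; the class name CommentedSplit and the helper split are made up for illustration. It compiles the same pattern once, with each part annotated inline.

```java
import java.util.regex.Pattern;

public class CommentedSplit {
    // Same pattern as in the answer above. With Pattern.COMMENTS, whitespace in
    // the pattern is ignored and '#' starts a comment that runs to end of line.
    static final Pattern SEPARATOR = Pattern.compile(
          ",                               # literal comma\n"
        + "(?=(([^']*'){2})*[^']*$)        # even number of single quotes ahead\n"
        + "(?=(([^\"]*\"){2})*[^\"]*$)     # even number of double quotes ahead\n"
        + "(?![^()]*\\))                   # not inside an unnested (...) group\n",
        Pattern.COMMENTS);

    // Hypothetical helper: splits the input on every comma accepted by SEPARATOR.
    static String[] split(String input) {
        return SEPARATOR.split(input);
    }

    public static void main(String[] args) {
        String input = "5,(5,5),C'A,B','A,B',',B','A,',\"A,B\",C\"A,B\"";
        for (String tok : split(input)) {
            System.out.printf("<%s>%n", tok);
        }
    }
}
```

Compiling the pattern once avoids recompiling it on every call, which String.split would otherwise do for a pattern of this kind.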

Alternatively, try this pattern:

,(?![^(]*\))(?![^"']*["'](?:[^"']*["'][^"']*["'])*[^"']*$)

For Java, written as a string literal:

",(?![^(]*\\))(?![^\"']*[\"'](?:[^\"']*[\"'][^\"']*[\"'])*[^\"']*$)"

Instead of splitting the string, consider matching the tokens.
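
A minimal runnable sketch of this pattern with String.split follows. The pattern is written as a Java string literal, so backslashes are doubled and double quotes are escaped; the class name MixedQuoteSplit and the helper split are hypothetical.

```java
public class MixedQuoteSplit {
    // The second pattern as a Java string literal.
    // First lookahead: the comma is not followed by ')' before any '('.
    // Second lookahead: the comma is not followed by an odd number of quote
    // characters, where ' and " are counted together.
    static final String REGEX =
        ",(?![^(]*\\))(?![^\"']*[\"'](?:[^\"']*[\"'][^\"']*[\"'])*[^\"']*$)";

    // Hypothetical helper wrapping String.split with the pattern above.
    static String[] split(String input) {
        return input.split(REGEX);
    }

    public static void main(String[] args) {
        String input = "5,(5,5),C'A,B','A,B',',B','A,',\"A,B\",C\"A,B\"";
        for (String tok : split(input)) {
            System.out.printf("<%s>%n", tok);
        }
    }
}
```

Because this pattern counts single and double quotes together, an unpaired apostrophe inside a double-quoted field later in the string (for example a,"it's") stops an earlier comma from being treated as a separator.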

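// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
String s  = "5,(5,5),C'A,B','A,B',',B','A,',\"A,B\",C\"A,B\"";
// A token is either an optional prefix followed by a quoted section,
// a parenthesized group, or a plain run of non-comma characters.
Pattern p = Pattern.compile("(?:[^,]*(['\"])[^'\"]*\\1|\\([^)]*\\))|[^,]+");
Matcher m = p.matcher(s);
while (m.find()) {
  System.out.println(m.group());
}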

Output:

5
(5,5)
C'A,B'
'A,B'
',B'
'A,'
"A,B" 
C"A,B"
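
For inputs that go beyond the sample, such as nested parentheses or an apostrophe inside a double-quoted field, a hand-written scanner is one alternative to the regex approaches. This is not taken from any of the answers; it is a minimal sketch, and the class name ScanSplit and its split method are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ScanSplit {
    // Splits on commas that are outside quotes and outside parentheses.
    static List<String> split(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;   // current parenthesis nesting depth
        char quote = 0;  // active quote character, or 0 when outside quotes

        for (char c : input.toCharArray()) {
            if (quote != 0) {
                if (c == quote) {
                    quote = 0;               // matching closing quote
                }
            } else if (c == '\'' || c == '"') {
                quote = c;                   // opening quote
            } else if (c == '(') {
                depth++;
            } else if (c == ')' && depth > 0) {
                depth--;
            } else if (c == ',' && depth == 0) {
                tokens.add(current.toString());
                current.setLength(0);
                continue;                    // drop the separator itself
            }
            current.append(c);
        }
        tokens.add(current.toString());      // last token (may be empty)
        return tokens;
    }

    public static void main(String[] args) {
        String input = "5,(5,5),C'A,B','A,B',',B','A,',\"A,B\",C\"A,B\"";
        for (String tok : split(input)) {
            System.out.printf("<%s>%n", tok);
        }
    }
}
```

Unlike String.split, this sketch keeps trailing empty tokens, and it does not handle escaped quotes inside a quoted field.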

