
java - Regex to split a string containing multiple groups

I need to split this string:

(2005)[1]1,2,3,4[2]1(2008)[2]2–;3,4(2009)[3]1,2,3-4(2010)[4]1,2,3-4(2011)[5]1(2012)[5]2,3-4[6]1,2\[\](2014)[6]3-4[7]1-2(2015)[7]3-4[8]1-2(2016)[10]1[8]3-4[9]1-2,3-4(2017)[10]2

As:

1, "1,2,3,4"  
2, 1 2
2, 2–;3,4

For the input "(2005)[1]1,2,3,4" I need the value inside [ ] in capture group 1 and the rest of the string (1,2,3,4) in capture group 2, and this should repeat for the entire string.

I have created this regex, but it is not working as intended:

\[(.*?)\](.+?)(?=\[|\(|$)

Please see my regex implementation.

The problem is that when there is nothing after [], it captures the (year), which it should not do.

The (.+?)(?=\[|\(|$) part of the pattern matches one or more chars other than a newline, as few as possible, up to the leftmost [, ( or end of string. Because at least one char is required, an empty entry such as [] forces it to consume the (year) that follows. You need to allow matching zero or more chars here.
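To make the difference concrete, here is a minimal Java sketch that runs the original pattern and a zero-or-more variant over a shortened input. The class name LazyQuantifierDemo, the helper pairs, and the sample string are illustrative assumptions, not part of the question; the backslashes are doubled only because the regex is written as a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyQuantifierDemo {
    // Hypothetical helper: returns "group1|group2" for every match of regex in input.
    static List<String> pairs(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1) + "|" + m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        // Shortened sample (an assumption): an empty [] entry directly followed by a (year).
        String input = "[6]1,2[](2014)[6]3-4";

        // Original pattern: (.+?) needs at least one char, so it swallows "(2014)".
        for (String p : pairs("\\[(.*?)\\](.+?)(?=\\[|\\(|$)", input)) {
            System.out.println("one-or-more:  " + p);
        }

        // Zero-or-more variant: (.*?) may stay empty when "(" follows immediately.
        for (String p : pairs("\\[(.*?)\\](.*?)(?=\\[|\\(|$)", input)) {
            System.out.println("zero-or-more: " + p);
        }
    }
}
```

With this sample, the one-or-more pattern reports (2014) as group 2 of the empty [] entry, while the zero-or-more variant leaves group 2 empty.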

However, a [^\[(] negated character class here will be more efficient and elegant:

\[(.*?)\]([^\[(]*)

See this regex demo.

Or, a bit more efficiently:

\[([^\]\[]*)\]([^\[(]*)

See another regex demo.

Details

  • \[ - a literal [
  • ([^\]\[]*) - Group 1: any 0+ chars other than [ and ]
  • \] - a literal ]
  • ([^\[(]*) - Group 2: any 0+ chars other than [ and (
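
The pattern described above can be applied in Java with a Matcher.find() loop. This is a minimal sketch, not the asker's actual code: the class name BracketGroupSplitter, the method split, and the abbreviated input string are assumptions made for illustration. Note that the backslashes are doubled inside the Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketGroupSplitter {
    // \[([^\]\[]*)\]([^\[(]*) written as a Java string literal.
    private static final Pattern ENTRY =
            Pattern.compile("\\[([^\\]\\[]*)\\]([^\\[(]*)");

    // Hypothetical helper: one {group 1, group 2} pair per match, in input order.
    static List<String[]> split(String input) {
        List<String[]> result = new ArrayList<>();
        Matcher m = ENTRY.matcher(input);
        while (m.find()) {
            result.add(new String[] { m.group(1), m.group(2) });
        }
        return result;
    }

    public static void main(String[] args) {
        // Abbreviated sample in the same shape as the question's input (an assumption).
        String input = "(2005)[1]1,2,3,4[2]1(2008)[6]1,2[](2014)[6]3-4";
        for (String[] pair : split(input)) {
            System.out.println(pair[0] + ", \"" + pair[1] + "\"");
        }
    }
}
```

For this sample the loop yields five pairs: 1 with "1,2,3,4", 2 with "1", 6 with "1,2", an empty pair for [], and 6 with "3-4". The (year) parts are never captured because group 2 stops at the first ( or [.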
