
Split a string based on a text qualifier with regex in Java

I want to split a string based on a text qualifier, for example:

"1","10411721","MikeTison","08/11/2009","21/11/2009","2800.00","002934538","051","New York","10411720-002",".\Images\b.jpg",".\RTF\b.rtf"

Qualifier = "   Splitter = ,

I want to split the string on the splitter, but if the splitter appears inside the qualifiers ("), it should be ignored and the returned field should include the splitter.

The regular expression I am using is (?:|,)(\\"(?:[^\\"]+|\\"\\")*\\"|[^,]*)

but this regular expression only returns commas. Please help me with this, as I am new to regular expressions.

Please note that if the string contains newline characters, i.e. \\r\\n, they should be ignored.

"1","10411","Muis","a","21/11/2009","2800.06","0029683778","03005136851","Awan","10411720-001",".\Images\a.jpg",".\RTF\a.rtf"
"2","08/10/2009","07:32","Call","On-Net","030092343242342376543","Monk","00:00","1.500","0.000","10.000","0.200"
"2","08/10/2009","02:50","Call","Off-Net","030092343242342376543","Une","08:00","1.500","2.000","20.000","3.500"
"2","09/10/2009","03:55","SMS","On-Net","030092343242342376543","Mink","00:00","1.500","0.000","5.000","100.500"
"2","09/10/2009","12:30","Call","Off-Net","030092343242342376543","Zog","01:01","3.500","3.000","70.000","6.500"
"2","09/10/2009","09:11","Call","On-Net","030092343242342376543","Monk","02:30","2.00","2.000","90.000","4.000"

Probably the easiest solution is not to search for places to split, but to find the elements you want to return. In your case these elements

  • start with "
  • end with "
  • have no " inside.

So you can try something like

String data = "\"1\",\"10411721\",\"MikeTison\",\"08/11/2009\",\"21/11/2009\",\"2800.00\",\"002934538\",\"051\",\"New York\",\"10411720-002\",\".\\Images\\b.jpg\",\".\\RTF\\b.rtf\"";

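// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
// Group 1 captures the text between a pair of quotes.
Pattern p = Pattern.compile("\"([^\"]+)\"");
Matcher m = p.matcher(data);
while (m.find()) {
    System.out.println(m.group(1));
}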

Output:

1
10411721
MikeTison
08/11/2009
21/11/2009
2800.00
002934538
051
New York
10411720-002
.\Images\b.jpg
.\RTF\b.rtf
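
Because the pattern uses `+`, an empty field (`""`) produces no match and is silently skipped. The following is a minimal sketch of the same find-based approach that collects the fields into a list and keeps empty ones, assuming `*` in place of `+`. The class and method names are hypothetical and not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class QuotedFieldFinder {
    // A quote, any run of non-quote characters (possibly empty), a quote.
    private static final Pattern FIELD = Pattern.compile("\"([^\"]*)\"");

    // Returns the text between each pair of quotes, in order of appearance.
    static List<String> fields(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = FIELD.matcher(text);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // The comma inside "New York, NY" stays part of the field.
        System.out.println(fields("\"1\",\"New York, NY\",\"\",\"051\""));
        // prints [1, New York, NY, , 051]
    }
}
```

Anything between two quoted fields, including `\r\n` between records, falls outside the matches and is ignored. This sketch assumes every field is quoted and that fields never contain a quote character themselves.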

Remove the first and the last character of the whole string, then split on ","

String test = "\"1\",\"10411721\",\"MikeTison\",\"08/11/2009\",\"21/11/2009\",\"2800.00\",\"002934538\",\"051\",\"New York\",\"10411720-002\",\".\\Images\\b.jpg\",\".\\RTF\\b.rtf\"";

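// Strip the leading and trailing quote; this needs at least two characters,
// otherwise substring(1, length - 1) throws StringIndexOutOfBoundsException.
if (test.length() >= 2)
  test = test.substring(1, test.length() - 1);

// Requires: import java.util.Arrays;
System.out.println(Arrays.toString(test.split("\",\"")));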
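
To show that this approach keeps a comma that sits inside a field, here is a small self-contained sketch. It assumes every field is quoted and that the line has no whitespace around the outer quotes; the class and method names are hypothetical. The `-1` limit passed to `split` is an addition that keeps trailing empty fields, which `split` would otherwise drop.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only.
class StripAndSplit {
    static String[] fields(String line) {
        // Too short to hold an opening and a closing quote.
        if (line.length() < 2) {
            return new String[0];
        }
        // Drop the outer quotes, then split on the three-character delimiter ",".
        String inner = line.substring(1, line.length() - 1);
        return inner.split("\",\"", -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fields("\"1\",\"New York, NY\",\"051\"")));
        // prints [1, New York, NY, 051]
    }
}
```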

You can split using this regex:

String[] arr = input.split( "(?=(([^\"]*\"){2})*[^\"]*$),+" );

This regex splits on commas that are outside double quotes, using a lookahead to make sure that an even number of quotes follows the comma.
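
A minimal sketch of the lookahead idea applied to a line with a comma inside a quoted field follows. Two things differ from the one-liner and are assumptions of this sketch: it matches a single `,` rather than `,+`, so that two adjacent commas yield an empty field instead of being merged into one delimiter, and it strips the surrounding quotes from each token afterwards. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
class LookaheadSplit {
    // A comma followed by an even number of quotes up to the end of the input.
    private static final String OUTSIDE_QUOTES = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    static List<String> fields(String line) {
        List<String> result = new ArrayList<>();
        // Limit -1 keeps trailing empty fields.
        for (String token : line.split(OUTSIDE_QUOTES, -1)) {
            // Drop one pair of surrounding quotes, if present.
            if (token.length() >= 2 && token.startsWith("\"") && token.endsWith("\"")) {
                token = token.substring(1, token.length() - 1);
            }
            result.add(token);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(fields("\"1\",\"New York, NY\",,\"051\""));
        // prints [1, New York, NY, , 051]
    }
}
```

The lookahead rescans the rest of the input at every comma, so this may be slow on very long strings; for large files a dedicated CSV parser is likely the safer choice.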

This works even if you have newline characters. Try it out:

    String str="\"1\",\"10411721\",\"MikeTison\",\"08/11/2009\",\"21/11/2009\",\"2800.00\",\"002934538\",\"051\",\"New York\",\"10411720-002\",\".\\Images\\b.jpg\",\".\\RTF\\b.rtf\"";
    System.out.println(Arrays.toString(str.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)")));
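
When the input holds several records separated by line breaks, splitting on commas alone leaves the last field of one record and the first field of the next in the same token, because no comma separates them. One possible way to handle this, sketched here under the assumption that quoted fields never contain line breaks, is to split into lines first and then apply the same regex to each line. The class and method names are hypothetical; `\R` (any line break) requires Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only.
class MultiLineSplit {
    // Same lookahead as above: a comma followed by an even number of quotes.
    private static final String OUTSIDE_QUOTES = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    static List<String[]> records(String text) {
        List<String[]> rows = new ArrayList<>();
        // \R matches \r\n, \n, \r and other Unicode line breaks.
        for (String line : text.split("\\R")) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            rows.add(line.split(OUTSIDE_QUOTES, -1));
        }
        return rows;
    }

    public static void main(String[] args) {
        String text = "\"1\",\"a,b\"\r\n\"2\",\"c\"";
        for (String[] row : records(text)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The tokens keep their surrounding quotes, as in the snippet above.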
