
Regex based string split in Java

String delimiterRegexp = "(;|:|[^<]/)";
String value = "get/time/pick me <i>Jack</i>";
String[] splitedTexts = value.split(delimiterRegexp);
for (String text : splitedTexts) {
    System.out.println(text);
}

Output:
ge
tim
pick me <i>Jack</i>

Expected Result: 
get
time
pick me <i>Jack</i>

The character before each / is being consumed as part of the delimiter. Could anyone help me write a regex that splits the text on the delimiter "/" but ignores the "/" in an XML end tag?

Your regex should look like this:

(;|:|(?<!<)/)

It uses a negative lookbehind. Demo: https://regex101.com/r/2k1WI5/1/

Your current regex [^<]/ matches any single character other than < followed by /, and that character can even be \n, a space, or a Japanese character.

That is why you are losing letters: each one is consumed as part of the separator.

Following The fourth bird's recommendation, you can even simplify the regex to: ([;:]|(?<!<)/)
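As a minimal sketch of how the simplified pattern behaves with String.split, the snippet below reuses the input string from the question. The class name SplitOnSlash and the helper splitText are hypothetical names chosen for illustration, and the outer capturing group is dropped because split does not need it.

```java
public class SplitOnSlash {
    // Delimiters: ';', ':', or a '/' that is not immediately preceded by '<'.
    // The lookbehind (?<!<) checks the previous character without consuming it.
    static final String DELIMITER_REGEX = "[;:]|(?<!<)/";

    // Hypothetical helper: splits the value on the delimiters above.
    static String[] splitText(String value) {
        return value.split(DELIMITER_REGEX);
    }

    public static void main(String[] args) {
        String value = "get/time/pick me <i>Jack</i>";
        for (String text : splitText(value)) {
            System.out.println(text);
        }
        // Prints:
        // get
        // time
        // pick me <i>Jack</i>
    }
}
```

Because the lookbehind is zero-width, only the / itself is removed, so the letters before it stay in the resulting tokens.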

[^<]/ will match t/ and e/ in your input, so those letters are removed together with the slash.

Use a lookbehind instead. It gives the wanted behaviour: / is treated as a separator only when it is not part of a closing tag.

On regex101.com:

(?<!<)/

The whole regex:

(;|:|(?<!<)/)
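To make the difference between the two patterns visible, here is a small sketch that lists what each one actually matches in the question's input. The class name DelimiterMatches and the helper findMatches are hypothetical. Only the standard java.util.regex classes are used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DelimiterMatches {
    // Hypothetical helper: collects every substring of input
    // that the regex matches, in order of appearance.
    static List<String> findMatches(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String value = "get/time/pick me <i>Jack</i>";
        // The character class consumes the letter before each slash.
        System.out.println(findMatches("[^<]/", value));   // [t/, e/]
        // The lookbehind only inspects the previous character.
        System.out.println(findMatches("(?<!<)/", value)); // [/, /]
    }
}
```

Neither pattern matches the / inside </i>, but only the lookbehind version leaves the surrounding letters untouched.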


 