Regular expression: how to split on | while avoiding a split when \ comes before it
I have the following text:
aaa|bbbb|cccc|dddd\|eeee|ffff
and I want to split on |, except when the | is preceded by \, and obtain:
aaa
bbbb
cccc
dddd\|eeee
ffff
Thanks.
PS: I tried some regexp generators (for example http://txt2re.com/), but frankly regexp is anything but friendly.
Update: in the end I gave up. Regexp is not fast (I ran a benchmark), nor is it clear (compared with a function that everybody can follow), so I dropped it and am now using plain code.
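The function the asker ended up with is not shown. As a rough idea of what such plain code could look like, here is a minimal sketch that walks the string once and treats a backslash as escaping whatever character follows it. The class and method names (`EscapedSplit`, `splitUnescaped`) are made up for illustration, and keeping the escape backslashes in the output is an assumption based on the expected result in the question.

```java
import java.util.ArrayList;
import java.util.List;

public class EscapedSplit {

    /**
     * Splits text on '|' unless the '|' is escaped by a preceding backslash.
     * A backslash escapes the character after it; escapes are kept in the output.
     * (Hypothetical helper, not the asker's actual code.)
     */
    public static List<String> splitUnescaped(String text) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\\' && i + 1 < text.length()) {
                // Copy the backslash and the escaped character verbatim.
                current.append(c).append(text.charAt(++i));
            } else if (c == '|') {
                // Unescaped delimiter: close the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // The last field has no trailing delimiter.
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // Java literal for: aaa|bbbb|cccc|dddd\|eeee|ffff
        String input = "aaa|bbbb|cccc|dddd\\|eeee|ffff";
        for (String field : splitUnescaped(input)) {
            System.out.println(field);
        }
    }
}
```

Unlike String.split(), this version also keeps empty fields, including trailing ones; whether that is wanted depends on the data.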
This should do it:
(?<!\\\\)\\|
If you want to allow backslash-escaped backslashes, you can use:
(?<!(?<!\\\\)\\\\)\\|
So given the string aaa|bbbb|cccc|dddd\|eeee\\|ffff, the split would be:
aaa
bbbb
cccc
dddd|eeee\*
ffff
* Or dddd\|eeee\\ if you're not stripping the escape backslashes for some reason.
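To make the footnote concrete, here is a small sketch that splits with the two-level lookbehind and then strips the escape backslashes from each field. The class and method names are hypothetical, and the unescaping rule (a backslash followed by any character becomes that character) is an assumption about what "stripping" means here.

```java
import java.util.ArrayList;
import java.util.List;

public class LookbehindSplit {

    // Regex (?<!(?<!\\)\\)\| written as a Java string literal:
    // split on | unless it follows a backslash that is not itself preceded by a backslash.
    private static final String DELIMITER = "(?<!(?<!\\\\)\\\\)\\|";

    public static List<String> splitAndUnescape(String text) {
        List<String> fields = new ArrayList<>();
        for (String raw : text.split(DELIMITER)) {
            // Replace each backslash-plus-character pair with the character alone.
            fields.add(raw.replaceAll("\\\\(.)", "$1"));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Java literal for: aaa|bbbb|cccc|dddd\|eeee\\|ffff
        String input = "aaa|bbbb|cccc|dddd\\|eeee\\\\|ffff";
        for (String field : splitAndUnescape(input)) {
            System.out.println(field);
        }
    }
}
```

Dropping the replaceAll call leaves the fields exactly as split() returns them, escape backslashes included.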
Edit: I'm not familiar with the Java regular expression flavor; added escapes per ratchet freak's comment.
I tried to add this as a comment on eyelidlessness's answer, but I don't know how to format it there.
Anyhow, eyelidlessness's answer looks correct to me:
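// Requires: import java.util.Arrays;
String str = "aaa|bbbb|cccc|dddd\\|eeee|ffff";
// Split on every | that is not immediately preceded by a backslash.
String[] tokens = str.split("(?<!\\\\)\\|");
System.out.println(Arrays.toString(tokens));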
which prints:
[aaa, bbbb, cccc, dddd\|eeee, ffff]
Don't use split() for this. (You could if Java supported indefinite repetition inside lookbehind assertions, but it doesn't.)
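A quick way to see the problem is to feed a fixed-depth lookbehind one more backslash than it was written for. The sketch below reuses the two-level lookbehind pattern posted earlier on this page; the class name and the test string are made up for illustration.

```java
import java.util.Arrays;

public class LookbehindLimit {

    // Regex (?<!(?<!\\)\\)\| as a Java string literal.
    // It only looks two characters back, so it cannot count longer runs of backslashes.
    static final String DELIMITER = "(?<!(?<!\\\\)\\\\)\\|";

    public static String[] split(String text) {
        return text.split(DELIMITER);
    }

    public static void main(String[] args) {
        // Java literal for: ffff\\\|ggg
        // That is an escaped backslash followed by an escaped |, so it should stay one field.
        String input = "ffff\\\\\\|ggg";
        // The lookbehind sees "\\" before the | and treats the | as unescaped,
        // so split() returns two pieces instead of one.
        System.out.println(Arrays.toString(split(input)));
    }
}
```

Adding another lookbehind level only moves the failure to the next longer run of backslashes, which is why matching the fields themselves is the more robust route.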
Better to collect all the matches between the | characters:
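// Requires java.util.ArrayList, java.util.List, java.util.regex.Matcher and java.util.regex.Pattern.
List<String> matchList = new ArrayList<String>();
// A field is a run of escaped characters (backslash plus any character) or characters other than \ and |.
// + rather than * so the zero-length matches found at each | are not collected (empty fields are skipped).
Pattern regex = Pattern.compile("(?:\\\\.|[^\\\\|])+");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    matchList.add(regexMatcher.group());
}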
This correctly splits aaa|bbbb\\|cccc|dddd\|eeee|ffff\\\|ggg\\\\|hhhh into
aaa
bbbb\\
cccc
dddd\|eeee
ffff\\\|ggg\\\\
hhhh