Java String split with regex ignoring content in parentheses
I would like to split a String such as "word1 AND word2 OR (word3 AND (word4 OR word5)) AND word6" on "AND", but only where it appears outside of parentheses, to get: "word1", "word2 OR (word3 AND (word4 OR word5))", "word6".
Note that a block of parentheses can contain many other nested blocks of parentheses.
I've done some research and found a regex that does the opposite of what I want:
(?:[^AND(]|\\([^)]*\\))+
This regex selects everything except the "AND" outside of parentheses. I also tried lookahead and lookbehind, without success.
Is there a way to do what I'm asking with a regex?
Thanks
For the Pattern.compile method you can pass Pattern.DOTALL as a parameter. A code sample is given below:
import java.util.regex.*;

public class Test {
    public static void main(String[] args) {
        String s = "word1 AND word2 OR (word3 AND (word4 OR word5)) AND word6";
        String regEx = "(?:[^AND(]|\\([^)]*\\))+";
        Pattern pattern = Pattern.compile(regEx, Pattern.DOTALL);
        Matcher matcher = pattern.matcher(s);
        while (matcher.find()) {
            System.out.println("Found the text \"" + matcher.group() + "\" starting at "
                    + matcher.start() + " index and ending at index " + matcher.end());
        }
    }
}
Please try this.
Consider creating your own parser for this task (it is not that complicated).
1. Determine the ranges in which AND should be ignored, i.e. the parenthesized blocks.
   - Create a variable that tracks the level of nesting. Increase it when you find ( and decrease it when you find ).
   - If you find ( and the level changed from 0 to 1, it is the start of a range.
   - If you find ) and the level changed from 1 to 0, it is the end of a range.
2. Find the position of AND in your string (indexOf(str, fromIndex) can be helpful here) and check that it is outside the ranges you shouldn't split on.
3. Create the substring start,position and update the next start to position+"AND".length(). After this, try to substring the next part.
After point 3 you should have all the parts you are interested in.
Below is an example of a parser class that seems to do what you want. But before you use it, try to create your own implementation.
import java.util.ArrayList;
import java.util.List;

class Parser {

    // A top-level parenthesized block, from its '(' to the matching ')'.
    private static class Range {
        private int start, end;

        public Range(int start, int end) {
            this.start = start;
            this.end = end;
        }

        boolean isInside(int i) {
            return start <= i && i <= end;
        }

        public int getStart() {
            return start;
        }

        @Override
        public String toString() {
            return "Range [start=" + start + ", end=" + end + "]";
        }
    }

    private List<Range> ranges = new ArrayList<Range>();

    private boolean checkIfOutsideRanges(int i) {
        if (ranges.size() == 0)
            return true;
        if (ranges.get(0).getStart() > i)
            return true;
        for (Range r : ranges) {
            if (r.isInside(i))
                return false;
        }
        return true;
    }

    private List<Range> setUpRanges(String data) {
        ranges.clear(); // drop ranges from a previous parse() call
        int level = 0;
        int startOfRange = 0;
        int i = 0;
        for (char ch : data.toCharArray()) {
            if (ch == '(') {
                level++;
                if (level == 1)
                    startOfRange = i;
            }
            if (ch == ')') {
                level--;
                if (level == 0)
                    ranges.add(new Range(startOfRange, i));
            }
            i++;
        }
        return ranges;
    }

    public List<String> parse(String data) {
        String toFind = "AND";
        ranges = setUpRanges(data);

        // find indexes of "AND" we should split on
        List<Integer> toSplit = new ArrayList<Integer>();
        int i = -1;
        do {
            i = data.indexOf(toFind, i + 1);
            if (i != -1 && checkIfOutsideRanges(i))
                toSplit.add(i);
        } while (i != -1);

        // split on correct AND indexes
        List<String> results = new ArrayList<String>();
        int start = 0;
        for (Integer index : toSplit) {
            results.add(data.substring(start, index));
            start = index + toFind.length();
        }
        if (start < data.length())
            results.add(data.substring(start));
        return results;
    }
}
Usage example
String data = "word1 AND ((word2 AND word3) AND word4) AND word5";
Parser p = new Parser();
for (String s : p.parse(data))
System.out.println(s);
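The same nesting-level idea can also be applied in a single pass, without storing ranges first. The following is only a minimal sketch, not the answer's implementation: the class name TopLevelSplitter and the helpers splitTopLevel and isWholeWord are hypothetical names chosen for illustration. It assumes the parentheses in the input are balanced, and it differs from the Parser class in two ways: it trims each part, and it only treats the keyword as a separator when it is not glued to a letter or digit (so a word like "BAND" would not be split).

```java
import java.util.ArrayList;
import java.util.List;

class TopLevelSplitter {

    // Hypothetical helper: splits data on occurrences of keyword found at nesting level 0.
    // Assumes balanced parentheses; unbalanced input is not validated here.
    static List<String> splitTopLevel(String data, String keyword) {
        List<String> parts = new ArrayList<>();
        int level = 0; // current parenthesis nesting depth
        int start = 0; // start index of the part being collected
        int i = 0;
        while (i < data.length()) {
            char ch = data.charAt(i);
            if (ch == '(') {
                level++;
            } else if (ch == ')') {
                level--;
            } else if (level == 0
                    && data.startsWith(keyword, i)
                    && isWholeWord(data, i, keyword.length())) {
                // Separator found outside all parentheses: close the current part.
                parts.add(data.substring(start, i).trim());
                i += keyword.length();
                start = i;
                continue;
            }
            i++;
        }
        // Whatever follows the last separator is the final part.
        parts.add(data.substring(start).trim());
        return parts;
    }

    // True when the match at [pos, pos + len) has no letter or digit directly next to it.
    private static boolean isWholeWord(String data, int pos, int len) {
        int end = pos + len;
        boolean leftOk = pos == 0 || !Character.isLetterOrDigit(data.charAt(pos - 1));
        boolean rightOk = end == data.length() || !Character.isLetterOrDigit(data.charAt(end));
        return leftOk && rightOk;
    }

    public static void main(String[] args) {
        String data = "word1 AND word2 OR (word3 AND (word4 OR word5)) AND word6";
        for (String part : splitTopLevel(data, "AND")) {
            System.out.println("[" + part + "]");
        }
    }
}
```

For the string from the question, this sketch yields the three parts word1, word2 OR (word3 AND (word4 OR word5)) and word6. Whether trimming and the whole-word check are desirable depends on the input format, so treat both as assumptions to adjust.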