
Java Regular Expression Negative Look Ahead Finding Wrong Match

Assume I have the following string.

create or replace package test as
-- begin null; end;/
end;
/

I want a regular expression that finds every semicolon that is not preceded by a double dash ("--") on the same line. I'm using the pattern "(?!--.*);" but I'm still getting matches for the two semicolons on the second line.

I feel like I'm missing something about negative lookaheads, but I can't figure out what.

First of all, what you need is a negative lookbehind (?<!) rather than a negative lookahead (?!), since you want to check what comes before your potential match.

Even then, you can't use a negative lookbehind directly in your case, because Java's regex engine does not support lookbehind of unbounded length (for example, one containing .*). The engine has to be able to work out the maximum number of characters to look behind your potential match.
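As a side note, Java does accept a lookbehind whose maximum length is bounded, so one possible workaround is to replace .* with a bounded quantifier. The sketch below is only an illustration: the bound of 200 characters and the names BoundedLookbehindDemo and semicolonOffsets are assumptions made for this example, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BoundedLookbehindDemo {
    // Assumption: at most 200 characters separate "--" from a later semicolon
    // on the same line. The dot does not match line breaks, so the check
    // never reaches back into a previous line.
    private static final Pattern UNCOMMENTED_SEMICOLON =
            Pattern.compile("(?<!--.{0,200});");

    /** Returns the index of every semicolon with no "--" before it on its line. */
    static List<Integer> semicolonOffsets(String sql) {
        List<Integer> offsets = new ArrayList<>();
        Matcher m = UNCOMMENTED_SEMICOLON.matcher(sql);
        while (m.find()) {
            offsets.add(m.start());
        }
        return offsets;
    }

    public static void main(String[] args) {
        String sql = "create or replace package test as\n"
                + "-- begin null; end;/\n"
                + "end;\n"
                + "/";
        // Prints the offsets of semicolons that are outside "--" comments.
        System.out.println(semicolonOffsets(sql));
    }
}
```

If lines can be longer than the chosen bound, this approach silently misses comments, so the line-splitting suggestion may be the safer choice.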

That said, wouldn't it be simpler in your case to just split your String on line feeds/carriage returns and then remove the lines that start with "--"?
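A minimal sketch of that line-splitting idea might look like the following. The class and method names are hypothetical, and ignoring leading whitespace before "--" is an assumption that the suggestion itself does not state.

```java
import java.util.ArrayList;
import java.util.List;

final class LineFilterDemo {
    /**
     * Returns the lines of sql that are not "--" comment lines.
     * Assumption: leading whitespace before "--" is ignored.
     */
    static List<String> withoutCommentLines(String sql) {
        List<String> kept = new ArrayList<>();
        // \R matches any line break sequence (\n, \r\n, \r, ...), Java 8+.
        for (String line : sql.split("\\R")) {
            if (!line.trim().startsWith("--")) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String sql = "create or replace package test as\n"
                + "-- begin null; end;/\n"
                + "end;\n"
                + "/";
        for (String line : withoutCommentLines(sql)) {
            System.out.println(line);
        }
    }
}
```

Once the comment lines are gone, a plain search for ";" on the remaining lines is enough. Note that this handles only whole-line comments, not a "--" that appears after code on the same line.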

If you want to match semicolons only on the lines that do not start with --, this regex should do the trick:

^(?!--).*(;)

Example

I only made a few changes to your regex:

  1. Multi-line mode, so we can use ^ and $ and match line by line

  2. ^ at the beginning to anchor the match at the start of a line

  3. .* between the negative lookahead and the semicolon, because otherwise, once ^ is added, the pattern would only match a semicolon at the very start of a line (^;), which is wrong

(I also added parentheses around the semicolon so the demo page displays the result more clearly, but this is not necessary and you can change it to whatever is most convenient for your program.)
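For reference, here is one way the pattern above could be applied in Java with Pattern.MULTILINE. The class and method names are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MultilineLookaheadDemo {
    // MULTILINE makes ^ match at the start of every line, not only at the
    // start of the input.
    private static final Pattern SEMICOLON_ON_CODE_LINE =
            Pattern.compile("^(?!--).*(;)", Pattern.MULTILINE);

    /** Returns the index of the captured semicolon for each matching line. */
    static List<Integer> semicolonOffsets(String sql) {
        List<Integer> offsets = new ArrayList<>();
        Matcher m = SEMICOLON_ON_CODE_LINE.matcher(sql);
        while (m.find()) {
            offsets.add(m.start(1)); // group 1 is the semicolon itself
        }
        return offsets;
    }

    public static void main(String[] args) {
        String sql = "create or replace package test as\n"
                + "-- begin null; end;/\n"
                + "end;\n"
                + "/";
        System.out.println(semicolonOffsets(sql));
    }
}
```

Because .* is greedy, group 1 captures only the last semicolon on each matching line. A line with several semicolons would need a different approach, such as matching the line first and then scanning it.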

The reason "(?!--.*);" isn't working is that the negative lookahead asserts that, at the position just before a ;, the next two characters are not --, which of course succeeds every time (a ; is never --).
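A quick way to confirm this is to count the matches of the original pattern and of a bare ";" on the sample input. The helper below is a hypothetical sketch written for this purpose.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LookaheadPitfallDemo {
    /** Counts the non-overlapping matches of regex in input. */
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String sql = "create or replace package test as\n"
                + "-- begin null; end;/\n"
                + "end;\n"
                + "/";
        // The lookahead is tested at the semicolon itself, so it never
        // rejects anything: both counts come out the same.
        System.out.println(countMatches("(?!--.*);", sql));
        System.out.println(countMatches(";", sql));
    }
}
```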

In Java, to match a ; that doesn't have -- anywhere before it:

"\\G(((?<!--)[^;])*);"

To see this in action using a replaceAll() call:

String s = "foo; -- begin null; end;";
s = s.replaceAll("\\G(((?<!--)[^;])*);", "$1!");
System.out.println(s);

Output:

foo! -- begin null; end;

This shows that only semicolons that appear before a double dash are matched.
