
Why isn't this lookahead assertion working in Java?

I come from a Perl background and am used to doing something like the following to match leading digits in a string and perform an in-place increment by one:

my $string = '0_Beginning';

$string =~ s|^(\d+)(?=_.*)|$1+1|e;

print $string;        # '1_Beginning'

With my limited knowledge of Java, things aren't so succinct:

String string = "0_Beginning";

Pattern p = Pattern.compile( "^(\\d+)(?=_.*)" );

String digit = string.replaceFirst( p.toString(), "$1" ); // To get the digit

Integer oneMore = Integer.parseInt( digit ) + 1;          // Evaluate ++digit

string.replaceFirst( p.toString(), oneMore.toString() );  //

The regex doesn't match here... but it did in Perl.

What am I doing wrong here?

Actually, it matches. You can find out by printing

System.out.println(p.matcher(string).find());

The issue is with the line

String digit = string.replaceFirst( p.toString(), "$1" );

which is actually a no-op, because it replaces the first group (which is all you match; the lookahead is not part of the match) with the content of the first group.
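To make the no-op concrete, here is a minimal sketch of what that line produces. The class and method names are made up for illustration; only the pattern and the input come from the question.

```java
import java.util.regex.Pattern;

class NoOpReplaceDemo {
    // Reproduces the question's first replaceFirst call (hypothetical helper name).
    static String extractAttempt(String input) {
        Pattern p = Pattern.compile("^(\\d+)(?=_.*)");
        // The match is only the leading digits (the lookahead consumes nothing),
        // and "$1" is those same digits, so the string comes back unchanged.
        return input.replaceFirst(p.toString(), "$1");
    }

    // Returns true when the text cannot be parsed as an int (hypothetical helper name).
    static boolean parseFails(String text) {
        try {
            Integer.parseInt(text);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String digit = extractAttempt("0_Beginning");
        System.out.println(digit);             // 0_Beginning
        System.out.println(parseFails(digit)); // true
    }
}
```

So the symptom in the question is most likely a NumberFormatException from parseInt rather than a failed match.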

You can get the desired result (namely the digit) via the following code:

Matcher m = p.matcher(string);
String digit = m.find() ? m.group(1) : "";

Note: you should check the result of m.find() anyway, in case nothing matches. In that case you must not call parseInt, or you'll get an error. Thus the full code looks something like

Pattern p = Pattern.compile("^(\\d+)(?=_.*)");

String string = "0_Beginning";

Matcher m = p.matcher(string);
if (m.find()) {
    String digit = m.group(1);
    Integer oneMore = Integer.parseInt(digit) + 1;
    string = m.replaceAll(oneMore.toString());
    System.out.println(string);
} else {
    System.out.println("No match");
}

Let's see what you are doing here.

String string = "0_Beginning";
Pattern p = Pattern.compile( "^(\\d+)(?=_.*)" );

You declare and initialize String and Pattern objects.

String digit = string.replaceFirst( p.toString(), "$1" ); // To get the digit

(You are converting the pattern back into a string, and replaceFirst creates a new Pattern from this. Is this intentional?)

As Howard says, this replaces the first match of the pattern in the string with the contents of the first group, and the match of the pattern is just 0 here, the same as the first group. Thus digit is equal to string, ...

Integer oneMore = Integer.parseInt( digit ) + 1;          // Evaluate ++digit

... and your parsing fails here.

string.replaceFirst( p.toString(), oneMore.toString() );  //

This would work if the result were assigned back to string (replaceFirst returns a new String rather than modifying the original), though it again converts the pattern to a string and back to a pattern.

Here is how I would do this:

String string = "0_Beginning";
Pattern p = Pattern.compile( "^(\\d+)(?=_.*)" );

Matcher matcher = p.matcher(string);
StringBuffer result = new StringBuffer();
while (matcher.find()) {
    // group() is the whole match, i.e. just the digits (the lookahead consumes nothing)
    int number = Integer.parseInt(matcher.group());
    matcher.appendReplacement(result, String.valueOf(number + 1));
}
matcher.appendTail(result);
return result.toString(); // 1_Beginning

(Of course, for your regex the loop will only execute once, since the regex is anchored.)
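As a self-contained sketch of the same loop, the version below wraps it in a method and also runs it with an unanchored pattern, so that the loop body executes more than once. The class name, the method name, the second pattern, and the second input are illustrative assumptions, not part of the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AppendReplacementDemo {
    // Increments every number matched by the given pattern (hypothetical helper name).
    static String incrementAll(Pattern p, String input) {
        Matcher matcher = p.matcher(input);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            int number = Integer.parseInt(matcher.group());
            // Copies the text before the match, then the incremented number.
            matcher.appendReplacement(result, String.valueOf(number + 1));
        }
        // Copies whatever follows the last match.
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        // Anchored pattern from the question: the loop runs once.
        System.out.println(incrementAll(Pattern.compile("^(\\d+)(?=_.*)"), "0_Beginning")); // 1_Beginning
        // Unanchored variant (an assumption for illustration): the loop runs twice.
        System.out.println(incrementAll(Pattern.compile("\\d+(?=_)"), "0_a 41_b"));         // 1_a 42_b
    }
}
```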


Edit: To clarify my statement about string.replaceFirst:

This method does not return a pattern, but uses one internally. From the documentation:

Replaces the first substring of this string that matches the given regular expression with the given replacement.

An invocation of this method of the form str.replaceFirst(regex, repl) yields exactly the same result as the expression

Pattern.compile(regex).matcher(str).replaceFirst(repl)

Here we see that a new pattern is compiled from the first argument.
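The documented equivalence can be checked with a small sketch such as the following. The class and method names are invented for illustration; the two calls being compared are the ones quoted from the documentation.

```java
import java.util.regex.Pattern;

class ReplaceFirstEquivalence {
    // The convenience method on String: compiles the regex on every call.
    static String viaString(String str, String regex, String repl) {
        return str.replaceFirst(regex, repl);
    }

    // The expanded form quoted from the documentation.
    static String viaPattern(String str, String regex, String repl) {
        return Pattern.compile(regex).matcher(str).replaceFirst(repl);
    }

    public static void main(String[] args) {
        String regex = "^(\\d+)(?=_.*)";
        System.out.println(viaString("0_Beginning", regex, "1"));  // 1_Beginning
        System.out.println(viaPattern("0_Beginning", regex, "1")); // 1_Beginning
    }
}
```

If a Pattern object is already at hand, calling p.matcher(str).replaceFirst(repl) directly skips the extra compilation.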

This also shows us another way to do what you wanted to do:

String string = "0_Beginning";
Pattern p = Pattern.compile( "^(\\d+)(?=_.*)" );
Matcher m = p.matcher(string);
if(m.find()) {
    String digit = m.group();
    int oneMore = Integer.parseInt( digit ) + 1;
    return m.replaceFirst(String.valueOf(oneMore));
}

This compiles the pattern only once, instead of three times as in your original program, but it still does the matching twice (once for find, once for replaceFirst), instead of once as in my program.
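For readers on Java 9 or later, Matcher.replaceFirst also has an overload that takes a function from the match result to the replacement text, which is arguably the closest analogue to Perl's /e flag. This is not what either answer above uses; it is a minimal sketch assuming Java 9+, with an illustrative class and method name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IncrementLeadingNumber {
    // Compiled once and reused for every call.
    private static final Pattern LEADING = Pattern.compile("^(\\d+)(?=_.*)");

    // Increments the leading number; returns the input unchanged when nothing matches.
    static String increment(String input) {
        Matcher m = LEADING.matcher(input);
        // Requires Java 9+: the function receives the MatchResult of the first match
        // and returns the text to substitute for it.
        return m.replaceFirst(r -> String.valueOf(Integer.parseInt(r.group(1)) + 1));
    }

    public static void main(String[] args) {
        System.out.println(increment("0_Beginning")); // 1_Beginning
        System.out.println(increment("9_Middle"));    // 10_Middle
        System.out.println(increment("NoDigits"));    // NoDigits
    }
}
```

Note that parseInt still throws a NumberFormatException if the digits do not fit in an int, so very long digit runs would need a different numeric type.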
