Java - Unexpected result in Regex match

I'm trying to check whether each line is equal to "test". When I run the following code, I expect the result to be true, because every line is exactly "test". Yet the result is false.

// Expected outcome:
// "test\ntest\ntest" - should match
// "test\nfoo\ntest" - should not match
// "test\ntesttest\ntest" - should not match

Pattern pattern = Pattern.compile("^test$", Pattern.MULTILINE);
Matcher matcher = pattern.matcher("test\ntest");

System.out.println(matcher.matches()); // result is false

What am I missing here? Why is the result false?

matches() always tests the pattern against the whole string test\ntest, and Pattern.MULTILINE does not change that. But in your regex, you are specifying that the string should consist of only a single instance of test, since it's surrounded by the start and end anchors.

With Pattern.compile("^test$", Pattern.MULTILINE), you only ask the regex engine to match one single line that is equal to test. When you use Matcher#matches(), you tell the regex engine to match the full string. Since your string is not equal to test, you get false as the result.
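To make the point above concrete, here is a minimal sketch. The class and helper names (MatchesVsMultiline, matchesWhole) are made up for illustration. It shows that matches() fails on the two-line input whether or not the MULTILINE flag is set.

```java
import java.util.regex.Pattern;

class MatchesVsMultiline {
    // Hypothetical helper: true only if the regex matches the ENTIRE input,
    // which is what Matcher#matches() checks regardless of the flags.
    static boolean matchesWhole(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("^test$", Pattern.MULTILINE, "test"));       // true
        System.out.println(matchesWhole("^test$", Pattern.MULTILINE, "test\ntest")); // false
        System.out.println(matchesWhole("^test$", 0, "test\ntest"));                 // false
    }
}
```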

To validate a string whose lines are all equal to test, you may use

Pattern.compile("^test(?:\\Rtest)*$")

In older Java versions (before Java 8), you will need to replace \R (any line break) with \n or \r?\n (written as \\n or \\r?\\n inside a Java string literal).
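As a rough sketch of that fallback, assuming only LF and CRLF line breaks need to be recognized, the pattern could look like this. The class and method names (AllLinesTestLegacy, allLinesAreTest) are hypothetical.

```java
import java.util.regex.Pattern;

class AllLinesTestLegacy {
    // \r?\n stands in for \R here, so a lone \r is NOT treated as a line break.
    private static final Pattern ALL_TEST =
            Pattern.compile("^test(?:\\r?\\ntest)*$");

    // True if every line of the input is exactly "test".
    static boolean allLinesAreTest(String input) {
        return ALL_TEST.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(allLinesAreTest("test\ntest\ntest")); // true
        System.out.println(allLinesAreTest("test\r\ntest"));     // true
        System.out.println(allLinesAreTest("test\nfoo\ntest"));  // false
        System.out.println(allLinesAreTest("test\rtest"));       // false
    }
}
```

The last case is the trade-off of this substitution: a string that uses bare \r as its line separator is rejected.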

Demo:

Pattern pattern = Pattern.compile("^test(?:\\Rtest)*$");
Matcher matcher = pattern.matcher("test\ntest");
System.out.println(matcher.matches()); // => true
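The same pattern can be checked against the three cases listed in the question. This is a small sketch assuming Java 8 or later (for \R). The class and method names (AllLinesTest, allLinesAreTest) are made up for the example.

```java
import java.util.regex.Pattern;

class AllLinesTest {
    // Same regex as above: "test", then zero or more (line break + "test").
    private static final Pattern ALL_TEST =
            Pattern.compile("^test(?:\\Rtest)*$");

    // True if every line of the input is exactly "test".
    static boolean allLinesAreTest(String input) {
        return ALL_TEST.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(allLinesAreTest("test\ntest\ntest"));     // true
        System.out.println(allLinesAreTest("test\nfoo\ntest"));      // false
        System.out.println(allLinesAreTest("test\ntesttest\ntest")); // false
    }
}
```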

Pattern.MULTILINE lets ^ match just after a line separator and $ just before one, in addition to the start and end of the input. That isn't the default behaviour: by default they match only at the beginning and end of the input.
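The difference can be seen by counting find() hits with and without the flag. The following is only an illustrative sketch, and the names AnchorModes and countMatches are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchorModes {
    // Counts how many times find() succeeds for the given regex and flags.
    static int countMatches(String regex, int flags, String input) {
        Matcher matcher = Pattern.compile(regex, flags).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "test\ntest";
        // Default mode: ^ and $ are tied to the ends of the input, so nothing is found.
        System.out.println(countMatches("^test$", 0, input));                 // 0
        // MULTILINE: each line can match on its own.
        System.out.println(countMatches("^test$", Pattern.MULTILINE, input)); // 2
    }
}
```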

However, matches() tries to match the regex against the whole input text, which yields false because the input isn't equal to just "test".

Although matches() doesn't work here, you can use find() to find a subsequence of the input that matches the regex. Because ^ and $ also match after and before \n, your pattern finds two subsequences.

But that's just my two cents.

Pattern pattern = Pattern.compile("^test$", Pattern.MULTILINE);
Matcher matcher = pattern.matcher("test\ntest");

System.out.println(matcher.matches()); // prints "false", the whole input doesn't match a single "test"

System.out.println(matcher.find());    // prints "true"
System.out.println(matcher.group());   // prints "test"

System.out.println(matcher.find());    // prints "true"
System.out.println(matcher.group());   // prints "test"

System.out.println(matcher.find());    // prints "false"
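One caveat worth spelling out: a successful find() only shows that some line equals "test", not that every line does. The sketch below contrasts the two. The everyLineIsTest helper is an assumption of this example rather than part of the answer above: it splits on \R (Java 8+) and compares each line, and all names in it are hypothetical.

```java
import java.util.regex.Pattern;

class LineChecks {
    private static final Pattern TEST_LINE =
            Pattern.compile("^test$", Pattern.MULTILINE);

    // True if at least one line equals "test" (all a successful find() tells you).
    static boolean anyLineIsTest(String input) {
        return TEST_LINE.matcher(input).find();
    }

    // True only if every line equals "test"; splits on any line break.
    static boolean everyLineIsTest(String input) {
        for (String line : input.split("\\R", -1)) {
            if (!line.equals("test")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String mixed = "test\nfoo\ntest";
        System.out.println(anyLineIsTest(mixed));              // true
        System.out.println(everyLineIsTest(mixed));            // false
        System.out.println(everyLineIsTest("test\ntest\ntest")); // true
    }
}
```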
