
java.util.regex.Pattern doesn't agree with online regex debugger

I'm working on a regex for a program. I want the program to detect a certain executable named gruell[something].exe.

So I ended up with the following regex:

gruell.*\.exe[^\.]

After testing on both regex101 and RegExr, my test cases are detected properly.

My test set (and what should pass and fail):

  • gruell-Core.exe [PASS]
  • Gruell.exe [PASS]
  • gruell_x64.exe [PASS]
  • Gruell_x64-core.exe [PASS]
  • grull.exe [FAIL]
  • gruell_____.exe [PASS]
  • gruell_installer.msi [FAIL]
  • gruell.html [FAIL]
  • .gruell.exe.398sn [FAIL]
  • gru-ell.exe [FAIL]

When I run this on my machine using java.util.regex.Pattern, it does not find anything, even though the folder I told it to scan contains both:

  • gruell.exe
  • .gruell.exe.398sn

Now the interesting part is that when I remove [^.] it does detect files. However, it detects .gruell.exe.398sn as well, which is what I don't want.

Code in question:

File f = new File("G:\\dev\\gruell");
recursive_scan(f);

The function:

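// Requires: import java.io.File; import java.util.regex.Pattern;
static void recursive_scan(File location)
{
    // Note: listFiles() returns null if location is not a readable directory.
    for (File file : location.listFiles())
    {
        if (file.isDirectory())
        {
            recursive_scan(file);
        }
        else
        {
            // This is the pattern as asked about, including the trailing [^\\.]
            Pattern pattern = Pattern.compile("gruell.*\\.exe[^\\.]", Pattern.CASE_INSENSITIVE);
            if (pattern.matcher(file.getName()).find())
            {
                System.out.println("FOUND: " + file.getName());
            }
        }
    }
}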

After testing on both [regex101 and RegExr] my test cases are detected properly

That seems unlikely, since your pattern is indeed faulty, not only in Java's regex dialect but also in the ones tested by those sites. The only plausible explanation I see is that you were not actually testing the cases you thought you were. For example, your test inputs may have had trailing spaces or newlines.

Which brings me to the problem with your pattern. As you already observed,

Now the interesting part is that when I remove [^.] it will detect,

That's because that sub-expression must match one character (any character other than .). Your overall pattern therefore does not match "gruell-Core.exe", because there is no character after the .exe. Try matching "gruell-Core.exee" instead.
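To make this concrete, here is a small sketch that runs the question's pattern against a few names. The class name TrailingCharDemo and the helper found are made up for illustration. The trailing-newline input is only an assumption about what an online tester might have fed the pattern.

```java
import java.util.regex.Pattern;

class TrailingCharDemo {
    // The pattern from the question: the final [^\.] must consume
    // exactly one character that is not a dot.
    static final Pattern ORIGINAL =
            Pattern.compile("gruell.*\\.exe[^\\.]", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true if the pattern is found anywhere in the name.
    static boolean found(String name) {
        return ORIGINAL.matcher(name).find();
    }

    public static void main(String[] args) {
        System.out.println(found("gruell-Core.exe"));    // false: nothing follows ".exe"
        System.out.println(found("gruell-Core.exee"));   // true: the extra 'e' satisfies [^\.]
        System.out.println(found("gruell-Core.exe\n"));  // true: a newline is also "not a dot"
        System.out.println(found(".gruell.exe.398sn"));  // false: ".exe" is followed by a dot
    }
}
```

A negated character class such as [^\.] matches line terminators too, which is why an input with a trailing newline can appear to pass in a tester while the bare file name fails in Java.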

If you want your matches to end with .exe, then anchor your pattern instead (written here as it would appear in a Java string literal): gruell.*\\.exe$
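As a rough check, the anchored pattern can be run with find() against the test set from the question. The class name AnchoredPatternDemo and the helper found are illustrative only, and the CASE_INSENSITIVE flag is carried over from the question's code.

```java
import java.util.regex.Pattern;

class AnchoredPatternDemo {
    // Anchored variant: the match has to end where the input ends.
    static final Pattern ANCHORED =
            Pattern.compile("gruell.*\\.exe$", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true if the anchored pattern is found in the name.
    static boolean found(String name) {
        return ANCHORED.matcher(name).find();
    }

    public static void main(String[] args) {
        // Test set copied from the question.
        String[] names = {
            "gruell-Core.exe", "Gruell.exe", "gruell_x64.exe",
            "Gruell_x64-core.exe", "grull.exe", "gruell_____.exe",
            "gruell_installer.msi", "gruell.html", ".gruell.exe.398sn",
            "gru-ell.exe"
        };
        for (String name : names) {
            System.out.println((found(name) ? "PASS " : "FAIL ") + name);
        }
    }
}
```

Because find() searches anywhere in the input, a name with extra leading characters (for example ".gruell.exe") would still be found. If that is not wanted, the start can be anchored as well, or matches() can be used.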

Alright, thanks to the site provided by John Bollinger, https://www.regexplanet.com/advanced/java/index.html, I was able to find out two things that were wrong here.

First, I had to use:

 pattern.matcher(file.getName()).matches()

Instead of what I had:

 pattern.matcher(file.getName()).find()

Second, I had to remove [^.] from the end of the pattern string.

From:

"gruell.*\\.exe[^.]"

To:

"gruell.*\\.exe"
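The difference between the two calls can be sketched as follows: find() succeeds if the pattern occurs anywhere in the input, while matches() requires the whole input to match. The class name FindVersusMatchesDemo and both helper methods are invented for this illustration.

```java
import java.util.regex.Pattern;

class FindVersusMatchesDemo {
    // The corrected pattern, without the trailing [^.]
    static final Pattern FIXED =
            Pattern.compile("gruell.*\\.exe", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: pattern occurs somewhere in the name.
    static boolean findsSomewhere(String name) {
        return FIXED.matcher(name).find();
    }

    // Hypothetical helper: pattern covers the entire name.
    static boolean matchesWhole(String name) {
        return FIXED.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(findsSomewhere(".gruell.exe.398sn")); // true: "gruell.exe" occurs inside
        System.out.println(matchesWhole(".gruell.exe.398sn"));   // false: extra text on both sides
        System.out.println(matchesWhole("Gruell_x64-core.exe")); // true: whole name matches
        System.out.println(matchesWhole("gru-ell.exe"));         // false: no "gruell" prefix
    }
}
```

With matches(), no explicit ^ or $ anchors are needed, because the entire file name has to be consumed by the pattern.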


 