What is the correct regex pattern for this?

I was solving this question from HackerRank:

https://www.hackerrank.com/contests/find-google/challenges/find-google/problem

and came up with this pattern: "^[gG][o0O()\\[\\]{}][o0O()\\[\\]{}][gG][lLI][eE3]". But it gives the wrong answer for the test case g()()GI3. Can anyone tell me what the error is? Also, tell me if there is a more efficient expression for this.

    import java.util.regex.*;
    import java.io.*;
    import java.util.*;

    class Main {
        public static void main(String[] args) {
            Scanner s = new Scanner(System.in);
            String str = s.next();
            Pattern pattern = Pattern.compile("^[gG][o0O()\\[\\]{}][o0O()\\[\\]{}][gG][lLI][eE3]", Pattern.CASE_INSENSITIVE);
            Matcher matcher = pattern.matcher(str);
            if (matcher.matches())
                System.out.println("YES");
            else
                System.out.println("NO");
        }
    }

The problem with the current regex is that (), [] and <> are put inside a character class, which matches a single character from the set it lists, whereas the (), [] and <> sequences each consist of two characters.

You need to use a grouping construct with an alternation operator here to match the o's.
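
To make the difference concrete, here is a minimal sketch that compares a character class with an escaped two-character sequence. The class and method names (CharClassDemo, matchesAsClass, matchesAsSequence) are made up for illustration and are not part of the original question.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // A character class consumes exactly one character of input.
    static boolean matchesAsClass(String input) {
        return Pattern.matches("[()]", input);
    }

    // An escaped sequence inside a group consumes both characters, in order.
    static boolean matchesAsSequence(String input) {
        return Pattern.matches("(?:\\(\\))", input);
    }

    public static void main(String[] args) {
        System.out.println(matchesAsClass("("));     // true: one char from the set
        System.out.println(matchesAsClass("()"));    // false: two chars, class takes one
        System.out.println(matchesAsSequence("()")); // true: both chars matched in order
    }
}
```

This is why the original pattern, which allots one character class per "o", can never accept an input such as g()()GI3.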

You may use this pattern with Matcher.matches() (or the static Pattern.matches()), which requires the entire input to match:

Pattern pattern = Pattern.compile("[gG](?:[oO0]|\\(\\)|\\[]|<>){2}[gG][LlI][eE3]");

See the regex demo.

Details

  • [gG] - g or G
  • (?:[oO0]|\\(\\)|\\[]|<>){2} - two occurrences of:
    • [oO0] - o, O or 0
    • | - or
    • \\(\\) - a () substring
    • | - or
    • \\[] - a [] substring
    • | - or
    • <> - a <> substring
  • [gG] - g or G
  • [LlI] - l, L or I
  • [eE3] - e, E or 3
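
As a rough sketch of how this pattern could be wired into a program, the class below compiles it once and checks a few inputs. The class name GoogleMatcher, the helper isGoogle and every sample input except g()()GI3 are assumptions added for illustration.

```java
import java.util.regex.Pattern;

class GoogleMatcher {
    // Compiled once and reused; the regex is the one described in the list above.
    private static final Pattern GOOGLE =
            Pattern.compile("[gG](?:[oO0]|\\(\\)|\\[]|<>){2}[gG][LlI][eE3]");

    static boolean isGoogle(String input) {
        // matches() succeeds only if the entire input fits the pattern.
        return GOOGLE.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"g()()GI3", "G00gle", "g<>[]gLE", "g()gle", "google!"};
        for (String s : samples) {
            System.out.println(s + " -> " + (isGoogle(s) ? "YES" : "NO"));
        }
    }
}
```

The first three samples print YES; "g()gle" has only one "o" and "google!" has a trailing character, so both print NO.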

You can use this pattern. You do not need to list both the uppercase and lowercase form of a letter when you use the case-insensitive flag:

g(?:o|0|\(\)|<>|\[]){2}g[li][e3]
  • g - Matches g or G (as the case-insensitive flag is on)
  • (?:o|0|\(\)|<>|\[]) - Matches o or 0 or () or <> or []
  • [li] - Matches L or l or I or i
  • [e3] - Matches e or E or 3
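
A minimal sketch of a case-insensitive pattern of this shape in Java follows, using the Pattern.CASE_INSENSITIVE flag. It includes the \(\) alternative so that the question's g()()GI3 input is covered; the class name CaseInsensitiveGoogle and the other sample inputs are illustrative assumptions.

```java
import java.util.regex.Pattern;

class CaseInsensitiveGoogle {
    // Letters are written once in lowercase; the flag makes them match either case.
    private static final Pattern GOOGLE = Pattern.compile(
            "g(?:o|0|\\(\\)|<>|\\[]){2}g[li][e3]", Pattern.CASE_INSENSITIVE);

    static boolean isGoogle(String input) {
        return GOOGLE.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isGoogle("g()()GI3")); // true
        System.out.println(isGoogle("GOOGLE"));   // true
        System.out.println(isGoogle("gogle"));    // false: only one "o"
    }
}
```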


I propose this version of the regex string, along with the code:

String string = bufferedReader.readLine();
String regex = "^[gG]([o0O]|\\(\\)|\\[\\]|\\{\\}|\\<\\>){2}[gG][lLI][eE3]";        
String result = Pattern.matches(regex, string) ? "True" : "False";
System.out.println(result);

Since it's a "one-shot" search, you can use the static method matches of the Pattern class; that is, derive the result directly from a one-time match against your regex.

The regex is mostly the same as the one you devised, but some points are worth noting.

It is dangerous (if not erroneous) to use case insensitivity when your regex already lists letters of different case explicitly. If you want to do the matching the classic way, avoid the case-insensitivity flag. (In this particular case it would match a "google" written with a lowercase "i" in place of the "l", which would lead to false positives.)
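
The false positive can be reproduced with a small sketch that compiles the same regex with and without the flag. The names CaseFlagPitfall, strict and relaxed, and the input "googie", are hypothetical and chosen only to illustrate the point.

```java
import java.util.regex.Pattern;

class CaseFlagPitfall {
    // Same regex string for both variants; only the flag differs.
    static final String REGEX =
            "[gG](?:[oO0]|\\(\\)|\\[\\]|\\{\\}|<>){2}[gG][lLI][eE3]";

    static boolean strict(String input) {
        return Pattern.compile(REGEX).matcher(input).matches();
    }

    static boolean relaxed(String input) {
        // With the flag, the "I" in [lLI] also matches a lowercase "i".
        return Pattern.compile(REGEX, Pattern.CASE_INSENSITIVE).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(strict("googie"));  // false
        System.out.println(relaxed("googie")); // true: the false positive
    }
}
```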

Since both "o" positions accept the same alternatives, it is better to define them once in a single sub-expression and then use the quantifier {2} to say that you want exactly two instances of that sub-expression to match.

You want to find two occurrences of any of the following:

  • a lowercase or uppercase "o"
  • a zero
  • a pair of round, square, curly or angle brackets
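
One way to picture the sub-expression idea is to keep the "o" alternatives in a single string constant and compare writing it out twice with applying {2}. This is only a sketch; QuantifierDemo, O, REPEATED, QUANTIFIED and agree are names invented for this example.

```java
import java.util.regex.Pattern;

class QuantifierDemo {
    // The "o" alternatives, defined once as a non-capturing group.
    static final String O = "(?:[oO0]|\\(\\)|\\[\\]|\\{\\}|<>)";

    // Sub-expression written out twice.
    static final Pattern REPEATED =
            Pattern.compile("[gG]" + O + O + "[gG][lLI][eE3]");

    // Same sub-expression with the {2} quantifier.
    static final Pattern QUANTIFIED =
            Pattern.compile("[gG]" + O + "{2}[gG][lLI][eE3]");

    // True when both patterns give the same verdict for the input.
    static boolean agree(String input) {
        return REPEATED.matcher(input).matches() == QUANTIFIED.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"g()()GI3", "g{}<>gle", "gogle", "gooogle"}) {
            System.out.println(s + " -> " + QUANTIFIED.matcher(s).matches()
                    + " (agree: " + agree(s) + ")");
        }
    }
}
```

Both forms accept the same inputs; the quantified one is simply shorter and keeps the list of alternatives in one place.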

Last but not least, if you are looking for round, square or curly brackets, you must escape them because they are special characters in a regex (angle brackets are not special, but escaping them is harmless). Moreover, you must escape them the Java way, using a double backslash, because the backslash is itself an escape character in Java string literals.
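
As a hedged illustration of the escaping rules, the sketch below shows that an unescaped "(" does not compile as a regex, and builds the bracket alternatives with the standard Pattern.quote helper instead of hand-written backslashes. Pattern.quote is a swapped-in alternative, not what the answer itself uses, and the names EscapingDemo, compiles, oAlternatives and isGoogle are hypothetical.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class EscapingDemo {
    // Reports whether a regex string is syntactically valid.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Builds the "o" alternation; Pattern.quote turns each pair into a literal.
    static String oAlternatives() {
        StringBuilder sb = new StringBuilder("(?:[oO0]");
        for (String pair : new String[] {"()", "[]", "{}", "<>"}) {
            sb.append('|').append(Pattern.quote(pair));
        }
        return sb.append(')').toString();
    }

    static boolean isGoogle(String input) {
        return Pattern.matches("[gG]" + oAlternatives() + "{2}[gG][lLI][eE3]", input);
    }

    public static void main(String[] args) {
        System.out.println(compiles("("));        // false: unclosed group
        System.out.println(compiles("\\("));      // true: escaped literal "("
        System.out.println(isGoogle("g()()GI3")); // true
    }
}
```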

Here is the full code:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.util.regex.Pattern;

    class Main {
        public static void main(String[] args) throws IOException {
            BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(System.in));

            String string = bufferedReader.readLine();
            String regex = "^[gG]([o0O]|\\(\\)|\\[\\]|\\{\\}|\\<\\>){2}[gG][lLI][eE3]";
            String result = Pattern.matches(regex, string) ? "True" : "False";
            System.out.println(result);

            bufferedReader.close();
        }
    }

It prints "True" when fed the input "g()()GI3".
