
Regular Expression (RegEx) for User Name in Java

How do I write a regex for a user name string in Java?

Rules in the exercise:

  1. Only 3 - 10 characters.
  2. Only 'a'-'z', 'A'-'Z', '1'-'9', '_' and '.' are allowed.
  3. '_' and '.' may appear only 0 to 2 times in total.
  • "abc_._" = false
  • "abc..." = false
  • "abc__" = true
  • "abc.." = true
  • "abc_." = true

It would be easier if I did not use regex.
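For comparison, a regex-free version could be a single pass over the characters. This is only a sketch: the class name UserNameCheck is made up for illustration, and it assumes rule 3 limits '_' and '.' combined, as the examples above suggest.

```java
public class UserNameCheck {
    // Hypothetical helper: checks the three rules in one loop, without regex.
    public static boolean isUserNameCorrect(String userName) {
        // Rule 1: only 3 to 10 characters.
        if (userName.length() < 3 || userName.length() > 10) {
            return false;
        }
        int specials = 0;
        for (char c : userName.toCharArray()) {
            boolean letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
            boolean digit = c >= '1' && c <= '9';
            if (c == '_' || c == '.') {
                specials++;
            } else if (!letter && !digit) {
                return false; // Rule 2: character outside the allowed set.
            }
        }
        // Rule 3 (assumed to be a combined limit): at most two of '_' or '.'.
        return specials <= 2;
    }
}
```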


Leaving '1'-'9' aside for now, I have tried the following regexes, but neither works.

String username_regex = "[a-zA-Z||[_||.]{0,2}]{3,10}";
String username_regex = "[a-zA-Z]{3,10}||[_||.]{0,2}";

My function:

public static boolean isUserNameCorrect(String user_name) {
    String username_regex = "[a-zA-Z||[_]{0,2}]{3,10}";
    boolean isMatch = user_name.matches(username_regex);
    return isMatch;
}

What RegEx should I use?

A single regex can express all three requirements (every count is bounded, so the language is still regular), but it gets hard to read. So, I would make separate checks for each condition. For example, this regex checks conditions 1 and 2, and condition 3 is checked separately.

import java.util.regex.Pattern;

// Compiled once and reused across calls.
private static final Pattern usernameRegex = Pattern.compile("[a-zA-Z1-9._]{3,10}");

public static boolean isUserNameCorrect(String userName) {
    boolean isMatch = usernameRegex.matcher(userName).matches();
    // Condition 3: '_' and '.' together may appear at most twice ("abc_._" must be rejected).
    return isMatch && countChar(userName, '.') + countChar(userName, '_') <= 2;
}

// Counts how many times c occurs in s.
public static int countChar(String s, char c) {
    int count = 0;
    int index = s.indexOf(c, 0);
    while ( index >= 0 ) {
        count++;
        index = s.indexOf(c, index+1);
    }
    return count;
}

BTW, notice the precompiled Pattern, which lets you reuse a regex in Java (a performance gain, because compiling a regex is expensive).

A common assumption is that this kind of counting requires a context-free grammar, while regex is a regular grammar. That only applies to unbounded counting or nesting; here the limits are fixed, so the separate checks are a readability choice rather than a necessity.

First off, || isn't necessary for this problem, and in fact doesn't do what you think it does. I've only ever seen alternation used in groups for regex (for example, if you want to match Hello or World, you'd match (Hello|World) or (?:Hello|World)), and in those cases you only use a single | .


Next, let me explain why each of the regexes you tried won't work.

String username_regex = "[a-zA-Z||[_||.]{0,2}]{3,10}";

Quantifiers such as {0,2} aren't interpreted as quantifiers inside a character class; they just contribute the literal characters they are made of. In addition, nested character classes are simply combined. So this is effectively equal to:

String username_regex = "[a-zA-Z_|.{0,2}]{3,10}";

So it'll match some combination of 3-10 of the following: a - z , A - Z , 0 , 2 , { , } , a comma, . , | , and _ .

And that's not what you wanted.
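A quick way to see this is to feed the pattern a few strings made of those literals. The sketch below uses the first attempted pattern unchanged; the class and method names are made up for illustration. All three inputs in main should be accepted, including "abc...", which the exercise wants rejected.

```java
public class BrokenClassDemo {
    // The first attempted pattern from the question, unchanged.
    static final String ATTEMPT = "[a-zA-Z||[_||.]{0,2}]{3,10}";

    public static boolean matchesAttempt(String s) {
        return s.matches(ATTEMPT);
    }

    public static void main(String[] args) {
        // Braces, digits and the comma are ordinary members of the class.
        System.out.println(matchesAttempt("{0,2}"));
        // So is the pipe.
        System.out.println(matchesAttempt("a|b"));
        // Nothing limits how often '.' may appear.
        System.out.println(matchesAttempt("abc..."));
    }
}
```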


String username_regex = "[a-zA-Z]{3,10}||[_||.]{0,2}";

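Outside a character class, each | separates alternatives, so this is an alternation of three branches: 3 to 10 of a - z or A - Z ; the empty string (the empty branch between the two pipes); or 0 to 2 of _ , | , or . . Also not what you wanted.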


The easy way to do this is by splitting the requirements into two sections and creating two regex strings based off of those:

  1. Only 3 - 10 characters, where only 'a'-'z', 'A'-'Z', '1'-'9', '_' and '.' are allowed.
  2. '_' and '.' can only appear 0 to 2 times.

The first requirement is quite simple: we just need to create a character class including all valid characters and place limits on how many of those can appear:

"[a-zA-Z1-9_.]{3,10}"

Then I would validate that '_' and '.' together appear 0 to 2 times:

"[^._]*(?:[._][^._]*){0,2}"

or, as a negative check, reject any name that matches

".*[._].*[._].*[._].*" // Three or more of '_' or '.'. With matches(), the surrounding .* let those three characters sit anywhere in the name.

I'm unfortunately not experienced enough to figure out what a single regex would look like... But these are at least quite readable.
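Putting the two checks together might look like the sketch below. The class name, the constant names, and the exact form of the second pattern are my own assumptions for illustration, and the second pattern assumes the limit of two applies to '_' and '.' combined.

```java
import java.util.regex.Pattern;

public class TwoStepUserNameCheck {
    // Requirement 1: 3 to 10 characters, all from the allowed set.
    private static final Pattern ALLOWED =
        Pattern.compile("[a-zA-Z1-9_.]{3,10}");

    // Requirement 2 (assumed formulation): at most two of '_' or '.' in total.
    private static final Pattern AT_MOST_TWO_SPECIALS =
        Pattern.compile("[^._]*(?:[._][^._]*){0,2}");

    public static boolean isUserNameCorrect(String userName) {
        // Both patterns must match the whole string.
        return ALLOWED.matcher(userName).matches()
            && AT_MOST_TWO_SPECIALS.matcher(userName).matches();
    }
}
```

Keeping the patterns as separate constants means each rule can be read, tested, and changed without touching the other.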

Please try this: [a-zA-Z0-9]*[._]?[a-zA-Z0-9]*[._]?[a-zA-Z0-9]*

Niko

EDIT: You're right. Then use two regexes, both of which must match:

Regex1: ^[\\w.]{3,10}$
Regex2: ^[a-zA-Z0-9]*[_.]?[a-zA-Z0-9]*[_.]?[a-zA-Z0-9]*$

I hope I haven't forgotten anything!

May not be elegant but you may try this:

^(([A-Za-z0-9\._])(?!.*[\._].*[\._].*[\._])){3,10}$

Here is the explanation:

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  ^                        the beginning of the string
--------------------------------------------------------------------------------
  (                        group and capture to \1 (between 3 and 10
                           times (matching the most amount
                           possible)):
--------------------------------------------------------------------------------
    (                        group and capture to \2:
--------------------------------------------------------------------------------
      [A-Za-z0-9\._]           any character of: 'A' to 'Z', 'a' to
                               'z', '0' to '9', '\.', '_'
--------------------------------------------------------------------------------
    )                        end of \2
--------------------------------------------------------------------------------
    (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
      .*                       any character except \n (0 or more
                               times (matching the most amount
                               possible))
--------------------------------------------------------------------------------
      [\._]                    any character of: '\.', '_'
--------------------------------------------------------------------------------
      .*                       any character except \n (0 or more
                               times (matching the most amount
                               possible))
--------------------------------------------------------------------------------
      [\._]                    any character of: '\.', '_'
--------------------------------------------------------------------------------
      .*                       any character except \n (0 or more
                               times (matching the most amount
                               possible))
--------------------------------------------------------------------------------
      [\._]                    any character of: '\.', '_'
--------------------------------------------------------------------------------
    )                        end of look-ahead
--------------------------------------------------------------------------------
  ){3,10}                  end of \1 (NOTE: because you are using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \1)
--------------------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string

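This satisfies the requirements above as long as the name does not start with '.' or '_': the lookahead runs only after the first character has been consumed, so a name such as "_._" is accepted even though it contains three of them. Hope it helps :)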
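A variant that places the lookahead once at the start of the string, so the check for three special characters covers every position including the first, might look like this. The class and method names are placeholders, and the digit range 0-9 follows the regex above rather than the exercise's '1'-'9'.

```java
import java.util.regex.Pattern;

public class LookaheadUserNameCheck {
    // (?!...) fails the match if three of '.' or '_' occur anywhere in the string;
    // the character class then enforces the allowed set and the 3 to 10 length.
    private static final Pattern USER_NAME =
        Pattern.compile("^(?!.*[._].*[._].*[._])[A-Za-z0-9._]{3,10}$");

    public static boolean isUserNameCorrect(String userName) {
        return USER_NAME.matcher(userName).matches();
    }
}
```

Because the lookahead is evaluated once rather than after every character, this form also avoids rescanning the rest of the string on each repetition.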
