
Regex match character before and after underscore

I have to write a regex that matches the following:

  • String should start with a letter - [a-zA-Z]
  • String can contain letters, spaces, digits, _ and - (underscore and hyphen)
  • String should not end with _ or - (underscore and hyphen)
  • Underscore character should not have space before and after.

I came up with the following regex, but it doesn't seem to work:

/^[a-zA-Z0-9]+(\b_|_\b)[a-zA-Z0-9]+$/

Test case:

HelloWorld // match
Hello_World // match
Hello _World // doesn't match
Hello_ World // doesn't match
Hello _ World // doesn't match
Hello_World_1 // match
He110_W0rld // match
Hello - World // match
Hello-World // match
_HelloWorld // doesn't match
Hello_-_World // match
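
A note on why the attempt fails, as far as I can tell: _ is itself a word character, so there is no word boundary ( \b ) between a letter or digit and an underscore. Since the pattern requires an alphanumeric on both sides of the underscore, neither \b_ nor _\b can ever succeed. A minimal Python sketch to check this (Python and the names ATTEMPT, cases and matched are my own choices for illustration, not part of the question):

```python
import re

# The attempt from the question, unchanged.
ATTEMPT = re.compile(r"^[a-zA-Z0-9]+(\b_|_\b)[a-zA-Z0-9]+$")

# Test cases the question expects to match.
cases = [
    "HelloWorld", "Hello_World", "Hello_World_1", "He110_W0rld",
    "Hello - World", "Hello-World", "Hello_-_World",
]

# Collect the cases the attempt actually accepts.
matched = [s for s in cases if ATTEMPT.search(s)]
print(matched)  # prints []
```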

You may use

^(?!.*(?:[_-]$|_ | _))[a-zA-Z][\w -]*$

See the regex demo

Explanation :

  • ^ - start of string
  • (?!.*(?:[_-]$|_ | _)) - after some chars ( .* ) there must not appear ( (?!...) ) a _ or - at the end of string ( [_-]$ ), nor _ followed by a space, nor a space followed by _
  • [a-zA-Z] - the first char matched and consumed must be an ASCII letter
  • [\w -]* - 0+ word ( \w = [a-zA-Z0-9_] ) chars or space or -
  • $ - end of string
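
To check the pattern above against the test cases from the question, here is a small Python sketch. The language, the re.ASCII flag (used so that \w means [a-zA-Z0-9_] as described) and the names PATTERN, expected, results and mismatches are assumptions made for illustration.

```python
import re

# re.ASCII restricts \w to [a-zA-Z0-9_], as in the explanation above.
PATTERN = re.compile(r"^(?!.*(?:[_-]$|_ | _))[a-zA-Z][\w -]*$", re.ASCII)

# Test cases from the question, with the outcome it asks for.
expected = {
    "HelloWorld": True,
    "Hello_World": True,
    "Hello _World": False,
    "Hello_ World": False,
    "Hello _ World": False,
    "Hello_World_1": True,
    "He110_W0rld": True,
    "Hello - World": True,
    "Hello-World": True,
    "_HelloWorld": False,
    "Hello_-_World": True,
}

results = {s: PATTERN.search(s) is not None for s in expected}
mismatches = {s for s in expected if results[s] != expected[s]}
print(mismatches or "all test cases behave as listed")
```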

You could use this one:

^(?!^[ _-]|.*[ _-]$|.* _|.*_ )[\w -]*$

For the test cases in the regex tester, I used the gm modifiers so that each line is matched individually.

If the empty string should not be considered acceptable, then change the final * to a + :

^(?!^[ _-]|.*[ _-]$|.* _|.*_ )[\w -]+$

Meaning of each part

  • ^ and $ match the beginning/ending of the input
  • (?! ) : list of things that should not match:
    • | : logical OR
    • ^[ _-] : starts with any of these three characters
    • .*[ _-]$ : ends with any of these three characters
    • .* _ : has space followed by underscore anywhere
    • .*_ : has underscore followed by space anywhere
  • [\w -] : any alphanumeric character or underscore (also matched by \w ) or space or hyphen
  • * : zero or more times
  • + : one or more times
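
The difference between the * and + variants can be seen in a short Python sketch. Python, re.ASCII and the names STAR, PLUS and accepts are assumptions for illustration. Note that, as written, both variants also accept a leading digit, so rule 1 of the question (start with a letter) would probably need an extra check.

```python
import re

LOOKAHEAD = r"(?!^[ _-]|.*[ _-]$|.* _|.*_ )"
STAR = re.compile(r"^" + LOOKAHEAD + r"[\w -]*$", re.ASCII)
PLUS = re.compile(r"^" + LOOKAHEAD + r"[\w -]+$", re.ASCII)

def accepts(pattern, text):
    """Return True if the pattern matches the text."""
    return pattern.search(text) is not None

print(accepts(STAR, ""))              # True: * allows the empty string
print(accepts(PLUS, ""))              # False: + needs at least one character
print(accepts(PLUS, "Hello_World"))   # True
print(accepts(PLUS, "Hello _World"))  # False: space before underscore
print(accepts(PLUS, "1Hello"))        # True: a leading digit is not rejected
```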

What about this?

^[a-zA-Z](\B_\B|[a-zA-Z0-9 -])*[a-zA-Z0-9 ]$

Broken down:

^               
[a-zA-Z]        allowed characters at beginning
(
 \B_\B          underscore with no word boundary on either side
|                 or
 [a-zA-Z0-9 -]  other allowed characters
)*
[a-zA-Z0-9 ]    allowed characters at end
$
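
A quick Python check of this pattern (Python, re.ASCII and the name PATTERN are my assumptions). Because \B_\B needs a word character on both sides of the underscore, it appears to reject Hello_-_World, which the question lists as a match, and since the pattern needs at least two characters it also rejects a single letter.

```python
import re

PATTERN = re.compile(r"^[a-zA-Z](\B_\B|[a-zA-Z0-9 -])*[a-zA-Z0-9 ]$", re.ASCII)

def accepts(text):
    """Return True if the pattern matches the text."""
    return PATTERN.search(text) is not None

print(accepts("Hello_World"))    # True
print(accepts("Hello _World"))   # False: space before the underscore
print(accepts("Hello - World"))  # True
print(accepts("Hello_-_World"))  # False: "-" next to "_" is a word boundary
print(accepts("H"))              # False: at least two characters are required
```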

Oh! I love me some regex!

Would this work? /^[a-z]$|^[a-z](?:_(?=[^ ]))?(?:[a-z\d -][^ ]_[^ ])*[a-z\d -]*[^_-]$/i

I was a tad unsure of rule 4. Do you mean underscores can have a space before or after, or neither, but not both before and after?


 