
Regex to match if string *only* contains *all* characters from a character set, plus an optional one

I ran into a wee problem with Java regex. (I must say in advance, I'm not very experienced in either Java or regex.)

I have a string and a set of three characters. I want to find out whether the string is built from only these characters. Additionally (just to make it even more complicated), two of the characters must be in the string, while the third one is optional.

I do have a solution, my question is rather if anyone can offer anything better/nicer/more elegant, because this makes me cry blood when I look at it...

The set-up

  • The mandatory characters are: | (pipe) and - (dash).

    The string in question should be built from a combination of these. They can be in any order, but both have to be in it.

  • The optional character is: : (colon).

    The string can contain colons, but it does not have to. This is the only other character allowed, apart from the above two.

  • Any other characters are forbidden.

Expected results

Following strings should work/not work:

"------" = false
"||||" = false
"---|---" = true
"|||-|||" = true
"--|-|--|---|||-" = true

...and...

"----:|--|:::|---::|" = true
":::------:::---:---" = false
"|||:|:::::|" = false
"--:::---|:|---G---n" = false

...etc.
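For reference, the rules and expected results above can be restated as a plain loop with no regex at all. This is only a sketch of the specification, not part of the original question; the class and method names (PipeDashCheck, isValid) are made up for illustration.

```java
final class PipeDashCheck {
    // Returns true when s contains only '|', '-' and ':' and
    // includes at least one '|' and at least one '-'.
    static boolean isValid(String s) {
        boolean sawPipe = false;
        boolean sawDash = false;
        for (char c : s.toCharArray()) {
            switch (c) {
                case '|': sawPipe = true; break;
                case '-': sawDash = true; break;
                case ':': break;         // optional character, allowed
                default:  return false;  // any other character is forbidden
            }
        }
        return sawPipe && sawDash;
    }

    public static void main(String[] args) {
        System.out.println(isValid("---|---"));             // true
        System.out.println(isValid("------"));              // false
        System.out.println(isValid("----:|--|:::|---::|")); // true
        System.out.println(isValid("--:::---|:|---G---n")); // false
    }
}
```

A check like this is handy as a baseline to compare any candidate regex against.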

The "ugly" solution

Now, I have a solution that seems to work, based on this stackoverflow answer. The reason I'd like a better one will become obvious when you've recovered from seeing this:

if (string.matches("^[(?\\:)?\\|\\-]*(([\\|\\-][(?:\\:)?])|([(?:\\:)?][\\|\\-]))[(?\\:)?\\|\\-]*$") || string.matches("^[(?\\|)?\\-]*(([\\-][(?:\\|)?])|([(?:\\|)?][\\-]))[(?\\|)?\\-]*$")) {

    //do funny stuff with a meaningless string

} else {

   //don't do funny stuff with a meaningless string

}

Breaking it down

The first regex

 "^[(?\\:)?\\|\\-]*(([\\|\\-][(?:\\:)?])|([(?:\\:)?][\\|\\-]))[(?\\:)?\\|\\-]*$"

checks for all three characters

The next one

"^[(?\\|)?\\-]*(([\\-][(?:\\|)?])|([(?:\\|)?][\\-]))[(?\\|)?\\-]*$"

checks for the two mandatory ones only.

...Yea, I know...

But believe me, I tried. Everything else either failed to give the desired result or let through strings without the mandatory characters, etc.

The question is...

Does anyone know how to do it in a simpler / more elegant way?

Bonus question : There is one thing I don't quite get in the regexes above (more than one, but this one bugs me the most):

As far as I understand(?) regular expressions, (?\\|)? should mean that the character | is either present or not (unless I'm very much mistaken). Still, in the above setup it seems to enforce that character. This of course suits my purpose, but I cannot understand why it works that way.

So if anyone can explain what I'm missing there, that'd be really great. Besides, I suspect this holds the key to a simpler solution (checking for both mandatory and optional characters in one regex would be ideal).

Thank you all for reading (and suffering) through my question, and even bigger thanks to those who reply. :)

PS

I did try stuff like ^[\\|\\-(?:\\:)?)]$, but that would not enforce all mandatory characters.
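A possible explanation for the bonus question: inside square brackets, the characters (, ? and ) are ordinary literals, not grouping or "optional" syntax. So a class such as [(?\\|)?\\-] from the second regex simply means "one of (, ?, |, ) or -". The apparent enforcement then seems to come from the middle alternation, which requires a - directly next to a character from the other class. The small check below illustrates the literal reading; the class name CharClassDemo is hypothetical.

```java
final class CharClassDemo {
    // Character class copied from the question's second regex (Java string literal).
    static final String CLASS_FROM_QUESTION = "[(?\\|)?\\-]";

    // True when s is exactly one character that belongs to the class.
    static boolean matchesClass(String s) {
        return s.matches(CLASS_FROM_QUESTION);
    }

    public static void main(String[] args) {
        // "(", "?" and ")" match as literal characters; ":" is not in this class.
        for (String s : new String[] {"(", "?", ")", "|", "-", ":"}) {
            System.out.println("\"" + s + "\" -> " + matchesClass(s));
        }
    }
}
```

If that reading is right, the original patterns would also accept strings containing parentheses or question marks, which is probably not intended.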

Use a lookahead based regex.

^(?=.*\\|)(?=.*-)[-:|]+$

or

^(?=.*\\|)[-:|]*-[-:|]*$

or

^[-:|]*(?:-:*\\||\\|:*-)[-:|]*$

DEMO 1
DEMO 2

  • (?=.*\\|) expects at least one pipe.
  • (?=.*-) expects at least one hyphen.
  • [-:|]+ matches any character from the list, one or more times.
  • $ matches the end of the line.
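As a quick sanity check, the first pattern can be run in Java against the sample strings from the question. This is a minimal sketch; the class name LookaheadDemo and the helper isValid are illustrative, not from the answer.

```java
import java.util.regex.Pattern;

final class LookaheadDemo {
    // First pattern from the answer, written as a Java string literal.
    static final Pattern BOTH_MANDATORY =
            Pattern.compile("^(?=.*\\|)(?=.*-)[-:|]+$");

    static boolean isValid(String s) {
        return BOTH_MANDATORY.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "------", "||||", "---|---", "|||-|||", "--|-|--|---|||-",
            "----:|--|:::|---::|", ":::------:::---:---",
            "|||:|:::::|", "--:::---|:|---G---n"
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" = " + isValid(s));
        }
    }
}
```

Compiling the pattern once and reusing it avoids recompiling on every call, which String.matches would do.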

Here is a simple answer:

(?=.*\|.*-|.*-.*\|)^([-|:]+)$

This says that the string needs to have a '-' followed by '|', or a '|' followed by a '-', via the look-ahead. Then the string only matches the allowed characters.

Demo: http://fiddle.re/1hnu96

Here is one without lookahead or lookbehind.

 ^(?:[-:|]*\\|[-:|]*-[-:|]*|[-:|]*-[-:|]*\\|[-:|]*)$

This doesn't scale, so Avinash's solution is to be preferred, provided your regex engine supports lookaheads.
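To illustrate the scaling point: with lookaheads, each extra mandatory character costs one more (?=.*x) group, whereas the alternation approach needs every ordering spelled out. The sketch below builds such a pattern from two strings of characters. The helper MandatoryCharsPattern.build is hypothetical, and it assumes at least one character is given and that all characters are in the Basic Multilingual Plane.

```java
import java.util.regex.Pattern;

final class MandatoryCharsPattern {
    // Backslash-escape anything that is not a letter or digit, so the character
    // is treated literally both inside and outside a character class.
    private static String escape(char c) {
        return Character.isLetterOrDigit(c) ? String.valueOf(c) : "\\" + c;
    }

    // Builds ^(?=.*m1)(?=.*m2)...[m1m2...o1o2...]+$ from the given characters.
    static Pattern build(String mandatory, String optional) {
        StringBuilder lookaheads = new StringBuilder();
        StringBuilder allowed = new StringBuilder();
        for (char c : mandatory.toCharArray()) {
            lookaheads.append("(?=.*").append(escape(c)).append(')');
            allowed.append(escape(c));
        }
        for (char c : optional.toCharArray()) {
            allowed.append(escape(c));
        }
        return Pattern.compile("^" + lookaheads + "[" + allowed + "]+$");
    }

    public static void main(String[] args) {
        Pattern twoMandatory = build("|-", ":");
        System.out.println(twoMandatory.matcher("---|---").matches());
        // A third mandatory character only adds one more lookahead.
        Pattern threeMandatory = build("|-+", ":");
        System.out.println(threeMandatory.matcher("|+-:").matches());
    }
}
```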
