
Java Regex, match pattern, pair of words

I am using a regex to validate a string in my application. I want to check whether the string has the following pattern: x=y&a=b&... where x, y, a, b, etc. can be empty.

Examples of correct strings:

abc=def&gef=cda&pdf=cdf
=&gef=def
abc=&gef=def
=abc&gef=def

Examples of incorrect strings:

abc=def&gef=cda&
abc=def&gef==cda&
abc=defgef=cda&abc=gda

This is my code showing current solution:

    String pattern = "[[a-zA-Z0-9]*[=]{1}[a-zA-Z0-9]*[&]{1}]*";
    if(!Pattern.matches(pattern, s)){
        throw new IllegalArgumentException(s);
    }

This solution is bad because it accepts strings like:

abc=def&gef=def&

Can anyone help me with the correct pattern?

You may use the following regex:

^[a-zA-Z0-9]*=[a-zA-Z0-9]*(?:&[a-zA-Z0-9]*=[a-zA-Z0-9]*)*$

See the regex demo

When used with matches(), the ^ and $ anchors may be omitted, since matches() requires the entire string to match.

Details:

  • ^ - start of string
  • [a-zA-Z0-9]* - 0+ alphanumeric chars (may be replaced with \\p{Alnum} )
  • = - a = symbol
  • [a-zA-Z0-9]* - 0+ alphanumeric chars
  • (?: - start of a non-capturing group matching sequences of...
    • & - a & symbol
    • [a-zA-Z0-9]*=[a-zA-Z0-9]* - same as above
  • )* - ... zero or more occurrences
  • $ - end of string
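
Put together in Java, the pattern described above could be checked as in the following minimal sketch. The class name PairPatternCheck and the helper isValid are hypothetical names chosen for illustration; the sample strings are the ones from the question.

```java
import java.util.regex.Pattern;

class PairPatternCheck {
    // The pattern from the answer above. The ^ and $ anchors are left out
    // because Matcher.matches() only succeeds on a whole-string match.
    static final Pattern PAIRS = Pattern.compile(
            "[a-zA-Z0-9]*=[a-zA-Z0-9]*(?:&[a-zA-Z0-9]*=[a-zA-Z0-9]*)*");

    // Hypothetical helper name, not part of the original question.
    static boolean isValid(String s) {
        return PAIRS.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "abc=def&gef=cda&pdf=cdf", // listed as correct in the question
            "=&gef=def",               // listed as correct: empty key and value
            "abc=def&gef=cda&",        // listed as incorrect: trailing &
            "abc=def&gef==cda&",       // listed as incorrect: doubled =
            "abc=defgef=cda&abc=gda"   // listed as incorrect: missing &
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Compiling the pattern once into a static field avoids recompiling it on every call, which Pattern.matches(pattern, s) would do.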

NOTE: To make the pattern more generic, you can match any character other than = and & with [^=&] in place of the more restrictive [a-zA-Z0-9]:

^[^=&]*=[^=&]*(?:&[^=&]*=[^=&]*)*$

See this regex demo
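
To illustrate the difference between the restrictive and the generic variant, here is a small sketch. The class name GenericPairCheck and the sample input are assumptions made for illustration only.

```java
import java.util.regex.Pattern;

class GenericPairCheck {
    // Restrictive variant: keys and values may only contain ASCII letters and digits.
    static final Pattern STRICT = Pattern.compile(
            "[a-zA-Z0-9]*=[a-zA-Z0-9]*(?:&[a-zA-Z0-9]*=[a-zA-Z0-9]*)*");

    // Generic variant: keys and values may contain anything except = and &.
    static final Pattern GENERIC = Pattern.compile(
            "[^=&]*=[^=&]*(?:&[^=&]*=[^=&]*)*");

    static boolean strict(String s) {
        return STRICT.matcher(s).matches();
    }

    static boolean generic(String s) {
        return GENERIC.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Hypothetical input containing a hyphen and a space.
        String s = "first-name=John Doe&city=";
        System.out.println("strict:  " + strict(s));
        System.out.println("generic: " + generic(s));
    }
}
```

Both variants still reject a trailing & and a doubled =, because the structure of the pattern is unchanged and only the character class differs.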

Here you go:

^\w*=\w*(?:&(?:\w*=\w*))*$

  • ^ is the starting anchor
  • \w*=\w* represents a parameter like abc=def
    • \w matches a word character [a-zA-Z0-9_]
    • \w* represents 0 or more such characters
  • & represents the actual ampersand literal
  • (?:&(?:\w*=\w*))* matches any subsequent parameters like &b=d etc.
  • $ is the ending anchor

In a Java string literal, each \w is written as \\w.

Regex101 Demo

EDIT: Made all groups non-capturing.

Note: As @WiktorStribiżew pointed out in the comments, \w also matches _, so if underscores must be rejected, the regex above should use [A-Za-z0-9] instead of \w.
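
The underscore difference can be seen with a short sketch like the one below. The class name WordCharCheck and the input user_id=42 are hypothetical, and the sketch assumes Java's default \w behavior (no UNICODE_CHARACTER_CLASS flag).

```java
import java.util.regex.Pattern;

class WordCharCheck {
    // \w version: by default \w is [a-zA-Z_0-9] in Java, so _ is allowed.
    static final Pattern WORD = Pattern.compile("\\w*=\\w*(?:&\\w*=\\w*)*");

    // Explicit version without the underscore.
    static final Pattern ALNUM = Pattern.compile(
            "[A-Za-z0-9]*=[A-Za-z0-9]*(?:&[A-Za-z0-9]*=[A-Za-z0-9]*)*");

    static boolean word(String s) {
        return WORD.matcher(s).matches();
    }

    static boolean alnum(String s) {
        return ALNUM.matcher(s).matches();
    }

    public static void main(String[] args) {
        String s = "user_id=42"; // hypothetical input with an underscore
        System.out.println("\\w version:        " + word(s));
        System.out.println("[A-Za-z0-9] version: " + alnum(s));
    }
}
```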

I believe you want this.

([a-zA-Z0-9]*=[a-zA-Z0-9]*&)*[a-zA-Z0-9]*=[a-zA-Z0-9]*

This matches any number of x=y pairs, each followed by a &, and then one final x=y pair with no trailing &.
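
As a sketch of how this pattern could be dropped into the check from the question, assuming a hypothetical PairListValidator class with a validate method that throws in the same way as the original code:

```java
import java.util.regex.Pattern;

class PairListValidator {
    // Zero or more "x=y&" groups followed by one final "x=y".
    static final Pattern PAIRS = Pattern.compile(
            "([a-zA-Z0-9]*=[a-zA-Z0-9]*&)*[a-zA-Z0-9]*=[a-zA-Z0-9]*");

    // Throws IllegalArgumentException for malformed input, as in the question.
    static void validate(String s) {
        if (!PAIRS.matcher(s).matches()) {
            throw new IllegalArgumentException(s);
        }
    }

    public static void main(String[] args) {
        validate("abc=def&gef=cda&pdf=cdf"); // passes silently
        try {
            validate("abc=def&gef=def&");    // the string the question's pattern wrongly accepted
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```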
