Split Java string on spaces with special characters and complications

I have an input string like this:

-a  var1=Bat"m/an  -b   var2=" -a="lol "  -c  var3=" M^a%g-i=c "

After splitting, I should get:

Output

- -a
- var1=Bat"m/an
- -b
- var2=" -a="lol "
- -c
- var3=" M^a%g-i=c "

Rules:

  • the format is something like -(char)(at least one space)variable=value
  • a value can contain any special characters except spaces, e.g. Bat"m/an
  • a value can contain spaces if it is in quotes, e.g. " -a="lol " or " M^a%g-i=c "

I have written the following regex, but quotes inside quotes break it:

(?:"[^"]*"|\S)+

I also tried parsing character by character and splitting on =", but both approaches are ambiguous because those characters can appear inside quotes too.

You may use this regex for matching with a lookahead assertion:

-?[a-z_]\w*(?:=".*?"(?=\h+(?:-[a-z](?=\h|$)|[a-z]\w*=)|$)|\S+)?


RegEx Explanation:

  • -? : Start with an optional hyphen
  • [a-z_]\w* : Match a name that starts with a lowercase letter or underscore, followed by 0+ word characters
  • (?: : Start non-capture group
    • =".*?"(?=...<expression>) : Match = followed by a quoted string that starts and ends with a double quote. The lazy .*? stops at the first closing quote for which the lookahead holds; the lookahead asserts that either whitespace plus another option or variable follows, or the end of the line.
    • | : OR
    • \S+ : Match 1+ non-whitespace characters
  • )? : End non-capture group; the group is optional, so a bare option such as -a matches on its own
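
In Java, the pattern above has to be written with doubled backslashes inside a string literal, and the tokens are collected with Matcher.find() rather than String.split(). The following is a minimal sketch under those assumptions; the class name ArgTokenizer and the helper tokenize are hypothetical names chosen for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ArgTokenizer {
    // The lookahead-based pattern, with backslashes doubled for a Java string literal.
    // \h (horizontal whitespace) is supported by java.util.regex since Java 8.
    private static final Pattern TOKEN = Pattern.compile(
        "-?[a-z_]\\w*(?:=\".*?\"(?=\\h+(?:-[a-z](?=\\h|$)|[a-z]\\w*=)|$)|\\S+)?");

    // Hypothetical helper: collects every match instead of splitting on whitespace.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The sample input from the question.
        String input = "-a  var1=Bat\"m/an  -b   var2=\" -a=\"lol \"  -c  var3=\" M^a%g-i=c \"";
        for (String token : tokenize(input)) {
            System.out.println(token);
        }
    }
}
```

For the sample input this yields the six tokens listed in the question, including var2=" -a="lol " with its inner quote intact. Note that the pattern only accepts lowercase option letters and names; inputs with uppercase names would need the character classes widened.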

You can try something like the following:

(-[a-z]|[^\s][^\s]*="?[^"]*"?[^\s]*)

Each parameter and each variable=value pair is captured as a separate match of the group.

Explanation:

Capturing Group (-[a-z]|[^\s][^\s]*="?[^"]*"?[^\s]*)
1st Alternative -[a-z]
2nd Alternative [^\s][^\s]*="?[^"]*"?[^\s]*
[^\s] - one character that is not whitespace
[^\s]* - zero or more further non-whitespace characters
= - matches the character = literally; it is mandatory
"? - an optional opening "
[^"]* - zero or more characters that are not "
"? - an optional closing "
[^\s]* - finally, zero or more non-whitespace characters
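
Because [^"]* stops at the first double quote it meets, it is worth checking how this pattern handles a quote nested inside a quoted value. The sketch below runs it over the sample input from the question; the class name SimplePatternCheck and the helper tokenize are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SimplePatternCheck {
    // The pattern from this answer, escaped for a Java string literal.
    private static final Pattern TOKEN = Pattern.compile(
        "(-[a-z]|[^\\s][^\\s]*=\"?[^\"]*\"?[^\\s]*)");

    // Hypothetical helper: returns the text captured by group 1 for each match.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group(1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The sample input from the question.
        String input = "-a  var1=Bat\"m/an  -b   var2=\" -a=\"lol \"  -c  var3=\" M^a%g-i=c \"";
        for (String token : tokenize(input)) {
            System.out.println("[" + token + "]");
        }
    }
}
```

On this input the fourth token comes out as var2=" -a="lol, without the trailing space and closing quote, so the pattern appears to fit only values that contain no inner double quote followed by a space. The other five tokens match the expected output.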


Hope that helps :)
