I'm having trouble with a regex. I need to search for text matching a pattern and, when it is found, cut it out of the string. I wrote a regex like this:
regex='(.*)((aa[[:space:]]bb)|(awd)|(bab)|(bc[[:space:]]d))(.*)'
in which I define everything before the target (1), the portion that can be the target (2), and everything after it (3). This is easy with a simple regex like (.*)(abc)(.*):
string="abc"; regex='(.*)(abc)(.*)'
[[ $string =~ $regex ]] && myvar=${BASH_REMATCH[2]} && buffer=${BASH_REMATCH[1]}${BASH_REMATCH[3]}
The trouble begins when I define a regex with nested parentheses and OR groups, like the first regex posted here. This is a sample from my shell:
$ string=" foo bar baz bac"
$ regex='(.*)((hello[[:space:]]world)|(example)|(funk[[:space:]]you)|(bar[[:space:]]baz))(.*)'
$ [[ $string =~ $regex ]] && echo ${BASH_REMATCH[1]}
foo
$ [[ $string =~ $regex ]] && echo ${BASH_REMATCH[2]}
bar baz
$ [[ $string =~ $regex ]] && echo ${BASH_REMATCH[3]}
$ [[ $string =~ $regex ]] && echo ${BASH_REMATCH[4]}
$ [[ $string =~ $regex ]] && echo ${BASH_REMATCH[5]}
$ [[ $string =~ $regex ]] && echo ${BASH_REMATCH[6]}
bar baz
$ [[ $string =~ $regex ]] && echo ${BASH_REMATCH[7]}
bac
$ [[ $string =~ $regex ]] && echo ${BASH_REMATCH[@]}
foo bar baz bac foo bar baz bar baz bac
The matching behaves strangely: I don't find the rest of the input string in ${BASH_REMATCH[3]}, although it is matched by the 3rd set of parentheses in the regex. What happens with nested parentheses?
bash assigns numbers to the capture groups based on a left-to-right ordering of the opening parentheses. Basically, it's a depth-first ordering, not breadth-first like you are assuming.
1. (.*)
2. (
3.   (hello[[:space:]]world)|
4.   (example)|
5.   (funk[[:space:]]you)|
6.   (bar[[:space:]]baz)
   )
7. (.*)
In this regular expression, group 2 is essentially a copy of whichever of groups 3, 4, 5 or 6 actually matches, since group 2 contains nothing else. Group 7 is what you think of as the 3rd parenthesis group.
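Applied to the original goal of cutting the match out of the string, this numbering suggests taking the tail from group 7 rather than group 3. The following is only a sketch under that reading: it reuses the string and pattern from the question, and the names nested, flat and flat_buffer are made up for illustration. The second half assumes the inner groups are not needed on their own, in which case dropping the inner parentheses leaves just groups 1, 2 and 3.

```bash
#!/usr/bin/env bash
string=" foo bar baz bac"

# Pattern from the question: the inner alternatives occupy groups 3-6,
# so the trailing (.*) ends up as group 7.
nested='(.*)((hello[[:space:]]world)|(example)|(funk[[:space:]]you)|(bar[[:space:]]baz))(.*)'
if [[ $string =~ $nested ]]; then
    myvar=${BASH_REMATCH[2]}                      # whichever alternative matched
    buffer=${BASH_REMATCH[1]}${BASH_REMATCH[7]}   # prefix + suffix (group 7, not 3)
fi
printf 'myvar=[%s]\nbuffer=[%s]\n' "$myvar" "$buffer"

# Alternative (assumes the inner groups are not needed individually):
# without the inner parentheses only three groups exist, so the suffix is group 3.
flat='(.*)(hello[[:space:]]world|example|funk[[:space:]]you|bar[[:space:]]baz)(.*)'
if [[ $string =~ $flat ]]; then
    flat_buffer=${BASH_REMATCH[1]}${BASH_REMATCH[3]}
fi
printf 'flat_buffer=[%s]\n' "$flat_buffer"
```

Both variants should leave the same remainder; which one reads better depends on whether you ever need to know which alternative matched.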
Group 0 is the entire match, which explains your last line using @:
$ [[ $string =~ $regex ]] && echo ${BASH_REMATCH[@]}
foo bar baz bac foo bar baz bar baz bac
|             | | | |     | |     | | |
+-------------+ +-+ +-----+ +-----+ +-+
       0         1     2       6     7
(The empty groups 3, 4, and 5 are swallowed up as whitespace during word-splitting.)
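To make the empty groups visible instead of losing them to word-splitting, one option is to print each element quoted, together with its index. This is a small sketch using the same string and regex as the question; the bracket delimiters in the output format are just an illustrative choice.

```bash
#!/usr/bin/env bash
string=" foo bar baz bac"
regex='(.*)((hello[[:space:]]world)|(example)|(funk[[:space:]]you)|(bar[[:space:]]baz))(.*)'

if [[ $string =~ $regex ]]; then
    # Iterate over the array indices; quoting each element preserves
    # empty strings and leading/trailing whitespace.
    for i in "${!BASH_REMATCH[@]}"; do
        printf '%d: [%s]\n' "$i" "${BASH_REMATCH[$i]}"
    done
fi
```

With this input the loop should print eight lines, with `[]` for groups 3, 4 and 5, `[bar baz]` for groups 2 and 6, and `[ bac]` for group 7.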