
How to split a string according to a regex in a bash script

I have such a string:

msg='123abc456def'

Now I need to split msg and get the result as below:

['123', 'abc', '456', 'def']

In Python, I can do it like this:

import re
pattern = re.compile(r'(\d+)')
res = pattern.split(msg)[1:]  # [1:] drops the leading empty string

How can I get the same result in a bash script?
I tried this, but it doesn't work:

IFS='[0-9]'    # how to define IFS with regex?
echo ${msg[@]}

Get the substrings with grep, and put the output in an array using command substitution:

$ msg='123abc456def'

$ out=( $(grep -Eo '[[:digit:]]+|[^[:digit:]]+' <<<"$msg") )

$ echo "${out[0]}"
123

$ echo "${out[1]}"
abc

$ echo "${out[@]}"
123 abc 456 def
  • The regex (ERE) pattern [[:digit:]]+|[^[:digit:]]+ matches one or more digits ( [[:digit:]]+ ) OR ( | ) one or more non-digits ( [^[:digit:]]+ ).
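
The unquoted command substitution above relies on word splitting, so a piece containing spaces or glob characters (for example `*`) could be split or expanded. A minimal sketch of the same grep approach that avoids this, assuming Bash 4 or newer (for the mapfile builtin) and input that contains no newlines:

```shell
#!/usr/bin/env bash
msg='123abc456def'

# grep -o prints each match on its own line; mapfile -t stores one line
# per array element without word splitting or pathname expansion.
mapfile -t out < <(grep -Eo '[[:digit:]]+|[^[:digit:]]+' <<<"$msg")

printf '%s\n' "${out[@]}"
```

Each array element is then exactly one run of digits or one run of non-digits, in the order they appear in msg.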

Given that you already know how to solve this in Python, you can solve it using the code shown in the question:

MSG=123abc456def;
python -c "import re; print('\n'.join(re.split(r'(\\d+)', '${MSG}')[1:]))"

Python is not as universally available as, say, grep or awk, but does that really matter to you?

I would match instead of split. I use grep here, but the same regex also works in pure bash.

$ msg='123abc456def'
$ grep -oE '[0-9]+|[^0-9]+' <<<"$msg"
123
abc
456
def
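
The pure-bash variant mentioned above could look roughly like the following sketch. It is one possible way to do it, not necessarily what the answer's author had in mind: the regex is anchored at the start of the string, and the loop peels off one match at a time using the [[ =~ ]] operator and BASH_REMATCH. The variable names re, rest and parts are chosen here for illustration.

```shell
#!/usr/bin/env bash
msg='123abc456def'

# One run of digits OR one run of non-digits, anchored at the start.
re='^([0-9]+|[^0-9]+)'
rest=$msg
parts=()

# Repeatedly match the leading run, store it, and strip it from the front.
while [[ -n $rest && $rest =~ $re ]]; do
    parts+=("${BASH_REMATCH[1]}")
    rest=${rest:${#BASH_REMATCH[1]}}
done

printf '%s\n' "${parts[@]}"
```

No external process is started, and the pieces end up in the parts array, so "${parts[0]}" is 123 and "${parts[1]}" is abc.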
