Split on multiple newline characters in bash

I need to split a string on a regex pattern of two or more consecutive newlines and store each resulting group as an element of an array in bash. awk and sed didn't help because they process one line at a time, and my input string contains multi-line text. How can I do that?

One solution is shown below. The tab used as a temporary separator can be replaced by another character that does not occur in the file.

str=$(cat newlines.dat)                 # read file into string

str=${str//$'\n'$'\n'/$'\t'}            # 2 newlines to 1 tab

while [[ "$str" =~ $'\t'$'\n' ]] ; do
  str=${str//$'\t'$'\n'/$'\t'}          # eat up further newlines
done

str=${str//$'\t'$'\t'/$'\t'}            # squeeze tabs

IFS=$'\t'                               # field separator is now tab
result=( $str )                         # split into array

cnt=0
for x in "${result[@]}"; do             # print result
  ((cnt++))
  echo -e "--- group $cnt ---\n$x"
done
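
The steps above can also be wrapped in a function, so that the IFS change stays local and glob characters in the text are not expanded. This is only a minimal sketch and not part of the original answer: the function name split_paragraphs and the sample string are made up for illustration, and it still assumes the text contains no tab.

```shell
#!/usr/bin/env bash
# split_paragraphs is a hypothetical helper name. It fills the global array
# `result` with the blocks of $1 that are separated by 2 or more newlines.
# Assumption: the text itself contains no tab character.
split_paragraphs() {
  local str=$1
  local nl=$'\n' sep=$'\t'

  str=${str//"$nl$nl"/$sep}             # every pair of newlines -> 1 tab
  while [[ $str == *"$sep$nl"* ]]; do
    str=${str//"$sep$nl"/$sep}          # absorb an odd leftover newline
  done

  local IFS=$sep                        # IFS change is limited to this function
  set -f                                # no pathname expansion while splitting
  result=( $str )                       # runs of tabs count as one delimiter
  set +f                                # note: re-enables globbing unconditionally
}

split_paragraphs $'a\nb\n\nc\n\n\n\nd'
printf '%s\n' "${#result[@]}"           # number of groups found
```

Because tab is one of the IFS whitespace characters, consecutive tabs are treated as a single delimiter during word splitting, so no separate squeeze step is needed in this variant.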

The input file:

1111111111
222222222

33333333333
44444444444


5555555555555



66666666666666
77777777


888888888888888
999999

The result:

--- group 1 ---
1111111111
222222222
--- group 2 ---
33333333333
44444444444
--- group 3 ---
5555555555555
--- group 4 ---
66666666666666
77777777
--- group 5 ---
888888888888888
999999
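
As a possible alternative to the string substitutions, awk can read blank-line-separated blocks as whole records when RS is set to the empty string (paragraph mode). The sketch below is an assumption-laden variant, not the method of the answer above: it needs bash 4.4 or later for `mapfile -d`, it assumes the text never contains the ASCII record separator byte (octal 036), and the variable names and sample data are made up.

```shell
#!/usr/bin/env bash
# Sketch only: requires bash 4.4+ (mapfile -d) and assumes the input
# contains no \036 byte, which is used here as the output delimiter.
input=$'1111\n2222\n\n3333\n\n\n\n4444\n5555\n'

# RS="" switches awk to paragraph mode: records are separated by one or
# more blank lines. ORS terminates each printed record with \036.
mapfile -t -d $'\036' groups < <(
  printf '%s' "$input" | awk 'BEGIN { RS = ""; ORS = "\036" } { print }'
)

cnt=0
for g in "${groups[@]}"; do
  cnt=$((cnt + 1))
  printf -- '--- group %d ---\n%s\n' "$cnt" "$g"
done
```

To process a file such as newlines.dat instead of the sample string, the same awk program can be given the file name as its argument.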


 