
Split a string to print first two characters delimited by “-” In Bash

I am listing the AWS region names.

us-east-1
ap-southeast-1

I want to abbreviate each name by taking specific leading characters from each part delimited by -, i.e. 'two characters'-'one character'-'one character'. So us-east-1 should be printed as use1 and ap-southeast-1 should be printed as aps1.

I have tried the following, and it gives me the expected results. I was wondering whether there is a shorter way to achieve this.

region=us-east-1
regionlen=$(echo -n "$region" | wc -m)
echo "$region" | sed 's/-//g' | cut -c 1-3,$((regionlen - 2))-$((regionlen - 1))

How about using sed:

echo "$region" | sed -E 's/^(.[^-]?)[^-]*-(.)[^-]*-(.).*$/\1\2\3/'

Explanation: the s/pattern/replacement/ command picks out the relevant parts of the region name, replacing the entire name with just the relevant bits. The pattern is:

^         - the beginning of the string
(.[^-]?)  - the first character, and another (if it's not a dash)
[^-]*     - any more things up to a dash
-         - a dash (the first one)
(.)       - The first character of the second word
[^-]*-    - the rest of the second word, then the dash
(.)       - The first character of the third word
.*$       - Anything remaining through the end

The bits in parentheses get captured, so \1\2\3 pulls them out and replaces the whole thing with just those.
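To check the substitution against both region names from the question, it can be wrapped in a small function. This is only a sketch: the name abbrev_region is made up for illustration, and it assumes every input has three dash-separated parts.

```shell
#!/usr/bin/env bash
# abbrev_region is a hypothetical helper name, not part of the answer above.
abbrev_region() {
  # Keep up to two characters of the first word and one each of the next two.
  printf '%s\n' "$1" | sed -E 's/^(.[^-]?)[^-]*-(.)[^-]*-(.).*$/\1\2\3/'
}

for region in us-east-1 ap-southeast-1; do
  abbrev_region "$region"
done
# prints use1, then aps1
```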

Set IFS to influence the field-splitting step that follows parameter expansion:

$ str=us-east-2
$ IFS=- eval 'set -- $str'
$ echo $#
3
$ echo $1
us
$ echo $2
east
$ echo $3
2

No external utilities; just processing in the language.

This is how smartly written build configuration scripts parse version numbers like 1.13.4 and architecture strings like i386-gnu-linux.

The eval can be avoided if we save and restore IFS:

$ save_ifs=$IFS; IFS=-; set -- $str; IFS=$save_ifs
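In bash, one way to avoid both the eval and the manual save/restore is to do the splitting inside a function with a local IFS. The sketch below assumes bash (local is not POSIX sh) and input of the form word-word-word; split_abbrev is a hypothetical name.

```shell
#!/usr/bin/env bash
# split_abbrev is a hypothetical name used only for this sketch.
split_abbrev() {
  local IFS=-   # the change to IFS is scoped to this function
  set -f        # disable globbing while the unquoted expansion is split
  set -- $1     # the positional parameters now hold the dash-separated fields
  set +f        # note: this re-enables globbing even if the caller had it off
  printf '%s%s%s\n' "${1:0:2}" "${2:0:1}" "${3:0:1}"
}

split_abbrev us-east-2        # use2
split_abbrev ap-southeast-1   # aps1
```

The set -f guard matters only if the string could contain glob characters such as *; for plain region names it is a precaution.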

Using bash, and assuming that you need to distinguish between things like southwest and southeast:

s=ap-southwest-1

a=${s:0:2}
b=${s#*-}
b=${b%-*}
c=${s##*-}

bb=
case "$b" in
south*) bb+=s ;;&
north*) bb+=n ;;&
*east*) bb+=e ;;
*west*) bb+=w ;;
esac

echo "$a$bb$c"

How about:

region="us-east-1"
echo "$region" | (IFS=- read -r a b c; echo "$a${b:0:1}${c:0:1}")
use1
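If the result is needed in a variable afterwards, the same read can take its input from a here-string instead of a pipe, so nothing runs in a subshell. A minimal sketch, assuming bash; the variable name short is made up for illustration.

```shell
#!/usr/bin/env bash
region="ap-southeast-1"
# IFS=- applies only to this read; the here-string avoids a pipeline subshell,
# so a, b and c remain set in the current shell.
IFS=- read -r a b c <<< "$region"
short="$a${b:0:1}${c:0:1}"
echo "$short"   # aps1
```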

A simple sed -

$: printf "us-east-1\nap-southeast-1\n" |
     sed -E 's/-(.)[^-]*/\1/g'

To keep noncardinal specifications like southeast distinct from south at the cost of adding an optional additional character -

$: printf "us-east-1\nap-southeast-1\n" |
   sed -E '
    s/north/n/;
    s/south/s/;
    s/east/e/;
    s/west/w/;
    s/-//g;'

If you could have south-southwest, add g to those directional reductions.

If you MUST have exactly 4 characters of output, I recommend mapping the 8 or 16 map directions to specific characters, so that north is N, northeast is maybe O and northwest M... that sort of thing.
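A rough sketch of that mapping idea, using a bash associative array (this needs bash 4 or later). Only N, O and M follow the suggestion above; the remaining letters, the fallback for non-direction words such as central, and the names dir_code and fixed_abbrev are arbitrary choices made for illustration.

```shell
#!/usr/bin/env bash
# Requires bash 4+ for associative arrays.
# Only N, O and M echo the suggestion above; the other letters are arbitrary picks.
declare -A dir_code=(
  [north]=N [northeast]=O [northwest]=M
  [south]=S [southeast]=T [southwest]=R
  [east]=E  [west]=W
)

# fixed_abbrev is a hypothetical name; it assumes input like word-word-word.
fixed_abbrev() {
  local a b c
  IFS=- read -r a b c <<< "$1"
  # Fall back to the first letter when the middle word is not a known direction.
  printf '%s%s%s\n' "${a:0:2}" "${dir_code[$b]:-${b:0:1}}" "${c:0:1}"
}

fixed_abbrev ap-southeast-1   # apT1
fixed_abbrev us-east-1        # usE1
```

Every direction then costs exactly one character, so southeast and south stay distinct while the output stays four characters long.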

