
Get substring using either perl or sed

I can't seem to get a substring correctly.

declare BRANCH_NAME="bugfix/US3280841-something-duh";

# Trim it down to "US3280841"
TRIMMED=$(echo $BRANCH_NAME | sed -e 's/\(^.*\)\/[a-z0-9]\|[A-Z0-9]\+/\1/g')

That still returns bugfix/US3280841-something-duh.

If I try to use perl instead:

declare BRANCH_NAME="bugfix/US3280841-something-duh";

# Trim it down to "US3280841"
TRIMMED=$(echo $BRANCH_NAME | perl -nle 'm/^.*\/([a-z0-9]|[A-Z0-9])+/; print $1');

That outputs nothing.

What am I doing wrong?

Using bash parameter expansion only:

$: # don't use caps; see below.
$: declare branch="bugfix/US3280841-something-duh"
$: tmp="${branch##*/}"
$: echo "$tmp"
US3280841-something-duh
$: trimmed="${tmp%%-*}" 
$: echo "$trimmed"
US3280841

Which means:

$: tmp="${branch##*/}"
$: trimmed="${tmp%%-*}"

does the job in two steps without spawning extra processes.
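If the trimming is needed in more than one place, the two expansions above could be wrapped in a small function. This is only a sketch: the name ticket_id is made up for illustration, and it assumes the ID is whatever sits between the last slash and the first dash after it.

```shell
# ticket_id is a hypothetical helper name, not part of the original answer.
ticket_id() {
  local branch="$1"
  local tmp="${branch##*/}"     # drop everything through the last slash
  printf '%s\n' "${tmp%%-*}"    # drop everything from the first dash onward
}

trimmed=$(ticket_id "bugfix/US3280841-something-duh")
echo "$trimmed"
```

Because ## removes the longest matching prefix, a branch with several slashes is still cut at the last one.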

In sed:

$: sed -E 's#^.*/([^/-]+)-.*$#\1#' <<< "$branch"

This says "after any or no characters followed by a slash, remember one or more that are not slashes or dashes, followed by a not-remembered dash and then any or no characters, then replace the whole input with the remembered part."
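To see how that expression behaves on other inputs, here is a small sketch. The function name extract and the extra branch names are made up for illustration. Note the last case: when there is no dash after the ID, the pattern does not match and sed prints the input unchanged.

```shell
# extract is a hypothetical wrapper around the sed expression above.
extract() {
  printf '%s\n' "$1" | sed -E 's#^.*/([^/-]+)-.*$#\1#'
}

a=$(extract "bugfix/US3280841-something-duh")  # one slash, dash after the ID
b=$(extract "feature/team/AB12-x")             # several slashes: the last one wins
c=$(extract "bugfix/US3280841")                # no dash: no match, input comes back as-is

printf '%s\n' "$a" "$b" "$c"
```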

Your original pattern was

's/\(^.*\)\/[a-z0-9]\|[A-Z0-9]\+/\1/g'

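Without -E, sed uses basic regular expressions, and POSIX does not define \| or \+ there. GNU sed accepts them as alternation and "one or more", but other implementations may read them as a literal pipe and a literal plus sign. Read that way, the pattern says "remember any number of anything, then match a slash, then a lowercase letter or a digit, then a pipe character, then a capital letter or digit, then a literal plus sign, and then replace it all with what you remembered."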
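For comparison, here is one way the original idea might be written in extended syntax, as a sketch rather than the only fix. The two character classes are merged into one, the + sits inside the group, and the group captures the ID instead of the prefix.

```shell
branch="bugfix/US3280841-something-duh"

# -E turns on extended regular expressions, so ( ) and + need no backslashes.
# ^.*\/           everything through the last slash
# ([A-Za-z0-9]+)  the run of letters and digits to keep
# .*$             the rest of the line, which is discarded
trimmed=$(printf '%s\n' "$branch" | sed -E 's/^.*\/([A-Za-z0-9]+).*$/\1/')
echo "$trimmed"
```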

GNU's manual is your friend. I look stuff up all the time to make sure I'm doing it right. Sometimes it still takes me a few tries, lol.

An aside: try not to use all-capital variable names. By convention, those are for variables that are special to the shell or the OS, like RANDOM or IFS.

You may use this sed:

sed -E 's~^.*/|-.*$~~g' <<< "$BRANCH_NAME"

US3280841

Or this awk:

awk -F '[/-]' '{print $2}' <<< "$BRANCH_NAME"

US3280841

sed 's:[^/]*/\([^-]*\)-.*:\1:' <<< "bugfix/US3280841-something-duh"

The Perl version just has the + in the wrong place. It should be inside the capture parentheses:

TRIMMED=$(echo $BRANCH_NAME | perl -nle 'm/^.*\/([a-z0-9A-Z]+)/; print $1');

In your sed command, just add a ^ before A-Z0-9 (this relies on GNU sed treating \| as alternation):

TRIMMED=$(echo $BRANCH_NAME | sed -e 's/\(^.*\)\/[a-z0-9]\|[^A-Z0-9]\+/\1/g')

Alternatively, and more briefly, you can simply delete the lowercase letters, slashes, and dashes:

TRIMMED=$(echo $BRANCH_NAME | sed "s/[a-z\/\-]//g" )

Type this in a shell terminal:

$ BRANCH_NAME="bugfix/US3280841-something-duh"

$ echo $BRANCH_NAME| perl -pe 's/.*\/(\w\w[0-9]+).+/\1/'

Use the s (substitute) command instead of m (match).
Perl's s command works much like sed's, so with GNU sed the same expression should also work with 'sed -E' in place of 'perl -pe'.

Another variant, using a Perl regular expression character class (see perldoc perlrecharclass):

echo $BRANCH_NAME | perl -nE 'say m/^.*\/([[:alnum:]]+)/;'
