Get substring using either perl or sed

I can't seem to get a substring correctly.

declare BRANCH_NAME="bugfix/US3280841-something-duh";

# Trim it down to "US3280841"
TRIMMED=$(echo $BRANCH_NAME | sed -e 's/\(^.*\)\/[a-z0-9]\|[A-Z0-9]\+/\1/g')

That still returns bugfix/US3280841-something-duh.

If I try to use perl instead:

declare BRANCH_NAME="bugfix/US3280841-something-duh";

# Trim it down to "US3280841"
TRIMMED=$(echo $BRANCH_NAME | perl -nle 'm/^.*\/([a-z0-9]|[A-Z0-9])+/; print $1');

That outputs nothing.

What am I doing wrong?

Using bash parameter expansion only:

$: # don't use caps; see below.
$: declare branch="bugfix/US3280841-something-duh"
$: tmp="${branch##*/}"
$: echo "$tmp"
US3280841-something-duh
$: trimmed="${tmp%%-*}" 
$: echo "$trimmed"
US3280841

Which means:

$: tmp="${branch##*/}"
$: trimmed="${tmp%%-*}"

does the job in two steps without spawning extra processes.
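If this is needed in more than one place, the two expansions can be wrapped in a small function. This is only a sketch: the name story_id is made up for illustration, and it assumes the ID is whatever sits between the last slash and the first dash after it.

```shell
# story_id is a hypothetical helper name, not part of the original answer.
# Assumes the ID sits between the last "/" and the next "-".
story_id() {
  tmp="${1##*/}"              # drop everything up to and including the last "/"
  printf '%s\n' "${tmp%%-*}"  # drop the first "-" and everything after it
}

story_id "bugfix/US3280841-something-duh"   # prints US3280841
```

A name with no dash after the slash comes back as the whole last path component, since the second expansion then has nothing to strip.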

In sed:

$: sed -E 's#^.*/([^/-]+)-.*$#\1#' <<< "$branch"

This says "after any or no characters followed by a slash, remember one or more that are not slashes or dashes, followed by a not-remembered dash and then any or no characters, then replace the whole input with the remembered part."
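A quick way to check that reading is to feed the same expression a few branch names. The sample names and the trim wrapper below are made up for illustration; note what happens when there is no dash to match.

```shell
# trim is a hypothetical wrapper around the sed expression from this answer.
trim() {
  printf '%s\n' "$1" | sed -E 's#^.*/([^/-]+)-.*$#\1#'
}

trim "bugfix/US3280841-something-duh"   # prints US3280841
trim "feature/team/US1-foo"             # prints US1
trim "hotfix/US42"                      # no dash, so no match: prints hotfix/US42
```

Because the pattern requires a dash after the remembered part, input without one is passed through unchanged rather than trimmed.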

Your original pattern was

's/\(^.*\)\/[a-z0-9]\|[A-Z0-9]\+/\1/g'

This says "remember any number of anything, then match a slash, then a lowercase letter or a digit, then a pipe character (because alternation only works with -E), then a capital letter or digit, then a literal plus sign, and then replace it all with what you remembered." That reading applies to a sed that sticks to POSIX basic regular expressions; GNU sed accepts \| and \+ without -E as extensions meaning alternation and one-or-more, so the result can differ between implementations.
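For comparison, here is a sketch of what the original expression seems to be aiming for, written with -E so that + needs no backslash, and with the two character ranges merged into one bracket expression inside the capture group. This is one possible rewrite under that assumption, not the only fix.

```shell
branch="bugfix/US3280841-something-duh"

# Capture the run of letters and digits after the last slash,
# then match (and discard) the rest of the line.
printf '%s\n' "$branch" | sed -E 's/^.*\/([A-Za-z0-9]+).*$/\1/'   # prints US3280841
```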

GNU's manual is your friend. I look stuff up all the time to make sure I'm doing it right. Sometimes it still takes me a few tries, lol.

An aside: try not to use all-capital variable names. By convention, those indicate a name that is special to the shell or OS, like RANDOM or IFS.

You may use this sed:

sed -E 's~^.*/|-.*$~~g' <<< "$BRANCH_NAME"

US3280841

Or this awk:

awk -F '[/-]' '{print $2}' <<< "$BRANCH_NAME"

US3280841

sed 's:[^/]*/\([^-]*\)-.*:\1:' <<< "bugfix/US3280841-something-duh"

The Perl version just has the + in the wrong place. It should be inside the capture brackets:

TRIMMED=$(echo $BRANCH_NAME | perl -nle 'm/^.*\/([a-z0-9A-Z]+)/; print $1');

Just use a ^ before A-Z0-9

TRIMMED=$(echo $BRANCH_NAME | sed -e 's/\(^.*\)\/[a-z0-9]\|[^A-Z0-9]\+/\1/g')

in your sed case.

Alternatively and briefly, you can use

TRIMMED=$(echo $BRANCH_NAME | sed "s/[a-z\/\-]//g" )

too.

Type this at a shell prompt:

$ BRANCH_NAME="bugfix/US3280841-something-duh"

$ echo $BRANCH_NAME| perl -pe 's/.*\/(\w\w[0-9]+).+/\1/'

Use the s (substitute) command instead of m (match).
Perl's substitution syntax is largely a superset of sed's, so the same command should also work with 'sed -E' in place of 'perl -pe' (at least where sed understands \w, as GNU sed does).

Another variant, using a character class in the Perl regular expression (see perldoc perlrecharclass).

echo $BRANCH_NAME | perl -nE 'say m/^.*\/([[:alnum:]]+)/;'
