Get a substring using perl or sed

Question

I can't seem to get a substring correctly.

declare BRANCH_NAME="bugfix/US3280841-something-duh";

# Trim it down to "US3280841"
TRIMMED=$(echo $BRANCH_NAME | sed -e 's/\(^.*\)\/[a-z0-9]\|[A-Z0-9]\+/\1/g')

That still returns bugfix/US3280841-something-duh.

If I try to use perl instead:

declare BRANCH_NAME="bugfix/US3280841-something-duh";

# Trim it down to "US3280841"
TRIMMED=$(echo $BRANCH_NAME | perl -nle 'm/^.*\/([a-z0-9]|[A-Z0-9])+/; print $1');

That outputs nothing.

What am I doing wrong?

Answer 1

Using bash parameter expansion only:

$: # don't use caps; see below.
$: declare branch="bugfix/US3280841-something-duh"
$: tmp="${branch##*/}"
$: echo "$tmp"
US3280841-something-duh
$: trimmed="${tmp%%-*}" 
$: echo "$trimmed"
US3280841

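Which means:

$: tmp="${branch##*/}"
$: trimmed="${tmp%%-*}"

does the job in two steps without spawning extra processes. The ## form removes the longest leading match of */ (everything through the last slash), and the %% form removes the longest trailing match of -* (everything from the first dash on).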
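The two expansions can be wrapped in a small function for reuse. This is a minimal sketch; the function name extract_ticket is hypothetical and not part of the original answer.

```shell
#!/usr/bin/env bash
# extract_ticket is a hypothetical helper name, not from the original answer.
extract_ticket() {
  local branch=$1
  local tmp=${branch##*/}      # remove the longest prefix ending in "/"
  printf '%s\n' "${tmp%%-*}"   # remove the longest suffix starting at "-"
}

ticket=$(extract_ticket "bugfix/US3280841-something-duh")
echo "$ticket"   # prints US3280841
```

If the name contains no slash or no dash, the corresponding expansion simply leaves the string unchanged, so "hotfix/US7" yields US7.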

In sed,

$: sed -E 's#^.*/([^/-]+)-.*$#\1#' <<< "$branch"

This says "after any or no characters followed by a slash, remember one or more that are not slashes or dashes, followed by a not-remembered dash and then any or no characters, then replace the whole input with the remembered part."
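To see how that pattern behaves on other branch names, here is a small sketch. The extra branch names are made up for illustration, and the input is piped in with printf rather than a here-string only to keep the example portable.

```shell
pattern='s#^.*/([^/-]+)-.*$#\1#'

# One slash, nested slashes, and a name with no dash after the last slash.
for b in "bugfix/US3280841-something-duh" "feature/team/US42-x" "hotfix/US7"; do
  printf '%s -> %s\n' "$b" "$(printf '%s\n' "$b" | sed -E "$pattern")"
done

# With -n and the p flag, sed prints only lines where the substitution matched.
matched=$(printf '%s\n' "bugfix/US3280841-something-duh" | sed -nE "${pattern}p")
unmatched=$(printf '%s\n' "hotfix/US7" | sed -nE "${pattern}p")
echo "matched=[$matched] unmatched=[$unmatched]"
```

The third name has no dash after the slash, so the pattern does not match and plain sed passes the line through unchanged. The -n/p form is one way to get an empty result in that case instead.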

Your original pattern was

's/\(^.*\)\/[a-z0-9]\|[A-Z0-9]\+/\1/g'

This says "remember any number of anything followed by a slash, then a lowercase letter or a digit, then a pipe character (because alternation only works with -E), then a capital letter or digit, then a literal plus sign, and then replace it all with what you remembered." That reading holds for a sed that sticks to POSIX basic regular expressions; GNU sed accepts \| and \+ as alternation and repetition even without -E, so the result differs between implementations.
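The difference between basic and extended syntax can be checked with unescaped operators, which behave the same way across sed implementations. The last command is one possible -E rewrite of the original attempt, offered as a sketch rather than as the only fix.

```shell
# Without -E, a bare + or | is an ordinary character.
printf 'a+b\n' | sed 's/a+b/X/'       # prints X
# With -E, | separates alternatives, so only the "a" is replaced.
printf 'a|b\n' | sed -E 's/a|b/X/'    # prints X|b

# One possible -E rewrite: a single character class, with + inside the group.
branch="bugfix/US3280841-something-duh"
trimmed=$(printf '%s\n' "$branch" | sed -E 's/^.*\/([A-Za-z0-9]+).*/\1/')
echo "$trimmed"                        # prints US3280841
```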

GNU's manual is your friend. I look stuff up all the time to make sure I'm doing it right. Sometimes it still takes me a few tries, lol.

An aside: try not to use all-capital variable names. By convention those are reserved for variables that are special to the shell or the OS, like RANDOM or IFS.
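To illustrate why the naming convention matters, here is a small sketch that overwrites PATH inside a command substitution, which runs in a subshell so the damage stays contained. PATH is just the example chosen here, and the directory name is made up.

```shell
# Command substitution runs in a subshell, so the outer PATH is untouched.
lookup=$(
  PATH="/nonexistent/dir"          # clobbers the command search path
  if command -v sed >/dev/null 2>&1; then echo "found"; else echo "not found"; fi
)
echo "sed with clobbered PATH: $lookup"

# Lowercase names are conventionally left for a script's own use.
path="bugfix/US3280841-something-duh"
command -v sed >/dev/null 2>&1 && echo "sed still found with lowercase path=$path"
```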

Answer 2

You may use this sed:

sed -E 's~^.*/|-.*$~~g' <<< "$BRANCH_NAME"

US3280841

Or this awk:

awk -F '[/-]' '{print $2}' <<< "$BRANCH_NAME"

US3280841
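The two commands agree on the sample input but not on every branch name, because awk's $2 is simply the second field after splitting on every slash and dash. A short comparison, using a made-up second branch name with two slashes:

```shell
for b in "bugfix/US3280841-something-duh" "feature/team/US42-x"; do
  # sed: delete everything through the last slash, and everything from the next dash on.
  via_sed=$(printf '%s\n' "$b" | sed -E 's~^.*/|-.*$~~g')
  # awk: split on "/" and "-" and print the second field.
  via_awk=$(printf '%s\n' "$b" | awk -F '[/-]' '{print $2}')
  printf '%s: sed=%s awk=%s\n' "$b" "$via_sed" "$via_awk"
done
```

For the second name the sed version yields US42 while the awk version yields team, so the awk form fits best when the prefix is known to contain exactly one slash and no dash.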

Answer 3

sed 's:[^/]*/\([^-]*\)-.*:\1:'<<<"bugfix/US3280841-something-duh"

Answer 4

The Perl version just has the + in the wrong place: outside the capture group, so $1 holds only the last repetition. It should be inside the capture brackets:

TRIMMED=$(echo $BRANCH_NAME | perl -nle 'm/^.*\/([a-z0-9A-Z]+)/; print $1');

Answer 5

Just use a ^ before A-Z0-9

TRIMMED=$(echo $BRANCH_NAME | sed -e 's/\(^.*\)\/[a-z0-9]\|[^A-Z0-9]\+/\1/g')

in your sed case. (This relies on GNU sed treating \| and \+ as alternation and repetition without -E.)

Alternatively and briefly, you can use

TRIMMED=$(echo $BRANCH_NAME | sed "s/[a-z\/\-]//g" )

too.
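The shorter command deletes every lowercase letter, slash, and dash, so it depends on the ID being the only part with capitals or digits. A quick check with a made-up second branch name whose suffix contains a digit:

```shell
strip='s/[a-z\/\-]//g'

# LC_ALL=C keeps [a-z] from matching uppercase letters in some locales.
first=$(printf '%s\n' "bugfix/US3280841-something-duh" | LC_ALL=C sed "$strip")
second=$(printf '%s\n' "bugfix/US3280841-fix2" | LC_ALL=C sed "$strip")

echo "$first"    # prints US3280841
echo "$second"   # prints US32808412
```

The trailing 2 from "fix2" survives in the second result, which is why the anchored patterns in the other answers are safer for general branch names.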

Answer 6

Type in a shell terminal:

$ BRANCH_NAME="bugfix/US3280841-something-duh"

$ echo $BRANCH_NAME| perl -pe 's/.*\/(\w\w[0-9]+).+/\1/'

Use the s (substitute) command instead of m (match).
Perl's regex syntax is largely a superset of sed's, so the same command should also work with 'sed -E' in place of 'perl -pe'.

Answer 7

Another variant using Perl regular expression character classes (see perldoc perlrecharclass).

echo $BRANCH_NAME | perl -nE 'say m/^.*\/([[:alnum:]]+)/;'

7 answers

Answer 1
6 2019-03-20 18:47:31

Answer 2
1 accepted 2019-03-20 18:42:05

Answer 3
1 2019-03-20 18:44:15

Answer 4
1 2019-03-20 18:47:36

Answer 5
0 2019-03-20 18:48:41

Answer 6
0 2019-04-07 23:13:28

Answer 7
0 2019-04-27 06:33:58
