
Bash: extract a part of a string, after a number

I have a few strings like this:

var1="string one=3423423 and something which i don't care"
var2="another bigger string=413145 and something which i don't care"
var3="the longest string ever=23442 and something which i don't care"

These strings are the output of a Python script (which I am not allowed to touch), and I need a way to extract the first part of each string, up to and including the number. Basically, my outputs should be:

"string one=3423423"
"another bigger string=413145"
"the longest string ever=23442"

As you can see, I can't use fixed positions, because the number and the string length are not always the same. I assume I would need a regex or something similar, but I don't really understand regexes. Can you please help with a command that can do this?

grep -oP '^.*?=\d+' inputfile
string one=3423423
another bigger string=413145
the longest string ever=23442

Here the -o flag makes grep print only the matching part, and -P enables Perl-compatible regular expressions in grep. \d+ means one or more digits, and .*? is a lazy match. So ^.*?=\d+ matches from the start of the line up to the first = that is followed by digits, and then through the last digit of that number.
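grep -P depends on PCRE support, which not every grep build provides. As a hedged alternative (not part of the original answer), the same match can be sketched with a POSIX extended regex, where [^=]* stands in for the lazy .*? and [0-9]+ for \d+. This assumes the first = on each line is the one before the number, as in the strings above, and that -o is available, as it is in GNU and BSD grep. The `result` variable is only for illustration:

```shell
# Feed the three sample strings to grep, one per line.
result=$(printf '%s\n' \
    "string one=3423423 and something which i don't care" \
    "another bigger string=413145 and something which i don't care" \
    "the longest string ever=23442 and something which i don't care" |
    grep -oE '^[^=]*=[0-9]+')

# Each output line ends at the last digit of the number after the first "=".
printf '%s\n' "$result"
```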

You could use parameter expansion, for example:

var1="string one=3423423 and something which i don't care"
name=${var1%%=*}
value=${var1#*=}
value=${value%%[^0-9]*}
echo "$name=$value"
# prints: string one=3423423

Explanation of ${var1%%=*}:

  • %% - remove the longest matching suffix
  • = - match =
  • * - match everything

Explanation of ${var1#*=}:

  • # - remove the shortest matching prefix
  • * - match everything
  • = - match =

Explanation of ${value%%[^0-9]*}:

  • %% - remove the longest matching suffix
  • [^0-9] - match any non-digit
  • * - match everything
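The difference between the shortest and longest forms is easiest to see on a string with more than one =. A small sketch, using a made-up sample value `s`; [!0-9] is the POSIX spelling of [^0-9], and bash accepts both:

```shell
s="key=12=tail"   # made-up sample with two "=" signs

longest_suffix=${s%%=*}    # strips "=12=tail"  -> key
shortest_suffix=${s%=*}    # strips "=tail"     -> key=12
shortest_prefix=${s#*=}    # strips "key="      -> 12=tail
longest_prefix=${s##*=}    # strips "key=12="   -> tail

# Keep only the leading run of digits from "12=tail".
digits=${shortest_prefix%%[!0-9]*}   # -> 12

printf '%s\n' "$longest_suffix" "$shortest_suffix" \
    "$shortest_prefix" "$longest_prefix" "$digits"
```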

To apply the same logic to several values easily, you could wrap it in a function:

extract_and_print() {
    local input=$1
    local name=${input%%=*}
    local value=${input#*=}
    value=${value%%[^0-9]*}
    echo "$name=$value"
}

extract_and_print "$var1"
extract_and_print "$var2"
extract_and_print "$var3"
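Since the strings come from another program's output, the same logic can also be applied line by line. A minimal sketch, assuming the script prints one string per line; `produce_lines` is a hypothetical stand-in for the Python script, and `extract` simply repeats the logic of extract_and_print:

```shell
# Same logic as extract_and_print; [!0-9] is the POSIX form of [^0-9].
extract() {
    local name value
    name=${1%%=*}
    value=${1#*=}
    value=${value%%[!0-9]*}
    printf '%s=%s\n' "$name" "$value"
}

# Hypothetical stand-in for the real script's output.
produce_lines() {
    printf '%s\n' \
        "string one=3423423 and something which i don't care" \
        "another bigger string=413145 and something which i don't care"
}

# IFS= and -r keep leading whitespace and backslashes intact.
result=$(produce_lines | while IFS= read -r line; do
    extract "$line"
done)

printf '%s\n' "$result"
```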

Another approach uses an extended glob pattern:

$ shopt -s extglob

$ echo "${var1%%+([^0-9])}"
string one=3423423

$ echo "${var2%%+([^0-9])}"
another bigger string=413145

$ echo "${var3%%+([^0-9])}"
the longest string ever=23442

+([^0-9]) is an extended pattern that matches one or more non-digits.
${var%%+([^0-9])} uses %%pattern to remove the longest match of that pattern from the end of the variable's value. This works here because the text after the number contains no digits.
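One caveat, shown as a hedged sketch with a made-up input `s2`: the extglob pattern only strips the trailing run of non-digits, so a digit later in the text changes the result, while splitting on the first = and keeping the leading digits is not affected. This needs bash, since shopt and extglob are bash features:

```shell
#!/usr/bin/env bash
shopt -s extglob

s1="string one=3423423 and something which i don't care"
s2="string one=3423423 and 2 more things"   # made-up input with a later digit

r1=${s1%%+([^0-9])}   # -> string one=3423423
r2=${s2%%+([^0-9])}   # -> string one=3423423 and 2

# Splitting on the first "=" and keeping the leading digits handles both.
rest=${s2#*=}
r3="${s2%%=*}=${rest%%[^0-9]*}"   # -> string one=3423423

printf '%s\n' "$r1" "$r2" "$r3"
```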

Refs: patterns, parameter substitution
