Bash: extract a part of a string, after a number
I have a few strings like this:
var1="string one=3423423 and something which i don't care"
var2="another bigger string=413145 and something which i don't care"
var3="the longest string ever=23442 and something which i don't care"
These strings are the output of a Python script (which I am not allowed to touch), and I need a way to extract the first part of each string, up to and including the number. Basically, my outputs should be:
"string one=3423423"
"another bigger string=413145"
"the longest string ever=23442"
As you can see, I can't use positions or anything like that, because the number and the string length are not always the same. I assume I would need to use a regex or something, but I don't really understand regexes. Can you please help with a command or something which can do this?
grep -oP '^.*?=\d+' inputfile
string one=3423423
another bigger string=413145
the longest string ever=23442
Here the -o flag makes grep print only the matching part, and -P enables Perl-compatible regular expressions in grep (a GNU grep option). \d+ means one or more digits. So ^.*?=\d+ matches from the start of the line, consuming as little as possible, up to an = that is followed by digits, and then through the last digit of that number (the first such match on the line).
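Since the question has the strings in shell variables rather than in a file, the same idea can be applied by feeding each variable to grep through a here-string. The sketch below is a variant, not the command from the answer: it assumes a POSIX extended regex (-E) with [^=]* standing in for the lazy .*?, so it does not depend on grep having -P support. The function name extract_prefix is made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer): print everything up to
# and including the digits that follow the first "=".
extract_prefix() {
    # -o prints only the matched part; -E selects extended regular expressions.
    # [^=]* cannot cross an "=", so it plays the role of the lazy .*? here.
    grep -oE '^[^=]*=[0-9]+' <<< "$1"
}

var1="string one=3423423 and something which i don't care"
extract_prefix "$var1"
# prints: string one=3423423
```

Unlike the lazy Perl pattern, [^=]*= always stops at the first =, so it only matches when digits follow that first = directly.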
You could use parameter expansion, for example:
var1="string one=3423423 and something which i don't care"
name=${var1%%=*}
value=${var1#*=}
value=${value%%[^0-9]*}
echo "$name=$value"
# prints: string one=3423423
Explanation of ${var1%%=*}:
%% - remove the longest matching suffix
= - match =
* - match everything
Explanation of ${var1#*=}:
# - remove the shortest matching prefix
* - match everything
= - match =
Explanation of ${value%%[^0-9]*}:
%% - remove the longest matching suffix
[^0-9] - match any non-digit
* - match everything
To do the same thing for more than one value easily, you could wrap this logic into a function:
extract_and_print() {
local input=$1
local name=${input%%=*}
local value=${input#*=}
value=${value%%[^0-9]*}
echo "$name=$value"
}
extract_and_print "$var1"
extract_and_print "$var2"
extract_and_print "$var3"
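To make the three expansions concrete, the sketch below traces them on var2 and prints each intermediate value. The variable names rest and digits are only illustrative and do not appear in the answer above.

```bash
#!/usr/bin/env bash
var2="another bigger string=413145 and something which i don't care"

name=${var2%%=*}          # drop the first "=" and everything after it
rest=${var2#*=}           # drop everything up to and including the first "="
digits=${rest%%[^0-9]*}   # drop from the first non-digit to the end

printf 'name:   %s\n' "$name"
printf 'rest:   %s\n' "$rest"
printf 'digits: %s\n' "$digits"
printf 'result: %s\n' "$name=$digits"
```

The last line prints result: another bigger string=413145, which is the same string extract_and_print "$var2" produces.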
$ shopt -s extglob
$ echo "${var1%%+([^0-9])}"
string one=3423423
$ echo "${var2%%+([^0-9])}"
another bigger string=413145
$ echo "${var3%%+([^0-9])}"
the longest string ever=23442
+([^0-9]) is an extended pattern that matches one or more non-digits.
${var%%+([^0-9])} with %%pattern removes the longest match of that pattern from the end of the variable's value.
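One caveat worth checking against your data: this expansion only trims trailing non-digits, so it relies on the text after the number containing no digits of its own. The sketch below uses a made-up input (not from the question) to show the difference, next to a variant that anchors on the first = instead. The names trim_tail and cut_after_number are hypothetical.

```bash
#!/usr/bin/env bash
shopt -s extglob   # must be enabled before the extended pattern is used

# Hypothetical helper: the extglob approach, trimming trailing non-digits.
trim_tail() {
    printf '%s\n' "${1%%+([^0-9])}"
}

# Hypothetical helper: keep the text up to the first "=" plus the digits after it.
cut_after_number() {
    local rest=${1#*=}
    printf '%s\n' "${1%%=*}=${rest%%[^0-9]*}"
}

sample="string one=3423423 and 2 other things"   # made-up input with a digit in the tail
trim_tail "$sample"          # prints: string one=3423423 and 2
cut_after_number "$sample"   # prints: string one=3423423
```

For the three strings in the question, whose tails contain no digits, both helpers give the same result.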
Refs: patterns, parameter substitution