Extract substring from grep/awk results?
I have a grep command that finds rows in a file, passes those to awk and prints out the 1st and 15th columns.
grep String1 /path/to/file.txt | grep string2 | awk -F ' ' '{print $1, $15}'
So far, so good. This results in a list like this:
2023-01-20 [text1]>
2023-01-22 [text2]>
2023-01-23 [text3]>
2023-01-25 [text4]>
Ideally, I'd like to add some regex to the awk command so that I get this:
2023-01-20 text1
2023-01-22 text2
2023-01-23 text3
2023-01-25 text4
My searches have only turned up how to use regex with awk to identify fields, not how to extract a substring from the results. Is this possible with awk or some other command?
One awk idea that combines the current code with the new requirement:
awk -v s1="String1" -v s2="string2" ' # feed both search strings in as awk variables "s1" and "s2"
$0~s1 && $0~s2 { print $1,substr($15,2,index($15,"]")-2) } # if s1 and s2 are both present in the current line then print 1st field and 15th field (sans the "[" "]" wrappers)
' /path/to/file.txt
A nonsensical demo file:
$ cat file.txt
a b c d e f g h i j k l m n o p q r s t u v w x y z
a string2 c d e f g h i j k l m n [old]> p q r s t u v String1 x y z
a b c d e f g h i j k l m n o p q r s t u v w x y z
a String1 c d e f g h i j k l m n [older]> p q r s t u v string2 x y z
Running the awk script against this file generates:
a old
a older
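To check the substr/index arithmetic in isolation, here is a minimal sketch applied to a single value in the shape the question shows. The function name strip_wrappers is made up for illustration, and the fallback for values without the wrappers is an added assumption, not part of the script above.

```shell
# Hypothetical helper: print the text between a leading "[" and the first "]".
strip_wrappers() {
    printf '%s\n' "$1" | awk '{
        end = index($1, "]")                 # position of the closing bracket, 0 if absent
        if (substr($1, 1, 1) == "[" && end > 0)
            print substr($1, 2, end - 2)     # characters between "[" and "]"
        else
            print $1                         # assumption: pass unwrapped values through unchanged
    }'
}

strip_wrappers '[text1]>'
strip_wrappers 'plain'
```

For [text1]> the closing bracket is at position 7, so substr starts at character 2 and takes 7-2 = 5 characters.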
Another option is to remove the leading [ and trailing ]> with gsub and an alternation:
awk '/String1/ && /string2/ {
  gsub(/^\[|\]>$/, "", $15)
  print $1, $15
}' file.txt
With GNU awk (gawk) you could use gensub:
awk '/String1/ && /string2/ {
  print $1, gensub(/^\[|\]>$/, "", "g", $15)
}' file
Or test for the strings as literal text with index, rather than as regular expressions:
awk 'index($0, "String1") && index($0, "string2") {
  gsub(/^\[|\]>$/, "", $15)
  print $1, $15
}' file
If you basically just want to delete the characters [, ] and >, you can simply use tr -d for this, something like:
... | tr -d "[]>"
Linux prompt>echo "2023-01-20 [text1]>" | tr -d "[]>"
2023-01-20 text1
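Keep in mind that tr -d deletes every occurrence of those characters anywhere on the line, which is fine for the output shown in the question. If the text itself might contain one of them, a sed substitution that strips only the wrappers is one possible alternative. This is a sketch on a made-up value, assuming the two-column output format from the question:

```shell
# Made-up input where the wrapped text itself contains a ">".
line='2023-01-20 [a>b]>'

# tr -d removes every "[", "]" and ">" on the line, including the one inside the text.
with_tr=$(printf '%s\n' "$line" | tr -d "[]>")

# sed removes only the "[" that opens the second column and the "]>" that ends the line.
with_sed=$(printf '%s\n' "$line" | sed 's/ \[\(.*\)]>$/ \1/')

printf '%s\n' "$with_tr"     # 2023-01-20 ab
printf '%s\n' "$with_sed"    # 2023-01-20 a>b
```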