Extract a substring from grep/awk results?

Question

I have a grep command that finds rows in a file, passes them to awk, and prints the 1st and 15th columns.

grep String1 /path/to/file.txt | grep string2 | awk -F ' ' '{print $1, $15}'

So far, so good. This results in a list like this:

2023-01-20 [text1]>
2023-01-22 [text2]>
2023-01-23 [text3]>
2023-01-25 [text4]>

Ideally, I'd like to add some regex to the awk command so that I get this:

2023-01-20 text1
2023-01-22 text2
2023-01-23 text3
2023-01-25 text4

My searches have only turned up how to use regex with awk to identify fields, not how to extract a substring from the results. Is this possible with awk or some other command?

Answer 1

One awk idea that combines the current code with the new requirement:

awk -v s1="String1" -v s2="string2" '                               # feed both search strings in as awk variables "s1" and "s2"
$0~s1 && $0~s2 { print $1,substr($15,2,index($15,"]")-2) }          # if s1 and s2 are both present in the current line then print 1st field and 15th field (sans the "[" "]" wrappers)
' /path/to/file.txt

A nonsensical demo file:

$ cat file.txt
a b c d e f g h i j k l m n o p q r s t u v w x y z
a string2 c d e f g h i j k l m n [old]> p q r s t u v String1 x y z
a b c d e f g h i j k l m n o p q r s t u v w x y z
a String1 c d e f g h i j k l m n [older]> p q r s t u v string2 x y z

Running the awk script against this file generates:

a old
a older
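
To make the substr/index arithmetic easier to follow, here is a minimal sketch applied to lines shaped like the question's output, where the bracketed text is the 2nd field rather than the 15th. The function name strip_wrappers is hypothetical and only there for illustration.

```shell
# Hypothetical helper (not part of the original answer): reads
# "date [text]>" lines on stdin and prints "date text".
# index($2, "]") is the 1-based position of "]". The "[" sits at
# position 1, so the text between them starts at 2 and has length
# index - 2.
strip_wrappers() {
  awk '{ print $1, substr($2, 2, index($2, "]") - 2) }'
}

printf '%s\n' '2023-01-20 [text1]>' '2023-01-22 [text2]>' | strip_wrappers
```

For "[text1]>" the "]" is at position 7, so substr($2, 2, 5) yields text1.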

Answer 2

Another option is to remove the leading [ and trailing ]> with gsub and an alternation:

awk '/String1/ && /string2/ {
  gsub(/^\[|\]>$/, "", $15)
  print $1, $15
}' file.txt

In gnu-awk you could use gensub:

awk '/String1/ && /string2/ {
  print $1, gensub(/^\[|\]>$/, "", "g", $15)
}' file

Or test for the strings with index, which matches them literally rather than as regexes:

awk 'index($0, "String1") && index($0, "string2"){
  gsub(/^\[|\]>$/, "", $15)
  print $1, $15
}' file
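
As a quick check, the gsub variant can be run against the demo data from Answer 1. This is only a sketch: the function names demo_input and strip_field15 are made up for illustration, and the data is fed on stdin instead of from a file.

```shell
# Hypothetical helper: emits three of the demo lines from Answer 1.
demo_input() {
  cat <<'EOF'
a b c d e f g h i j k l m n o p q r s t u v w x y z
a string2 c d e f g h i j k l m n [old]> p q r s t u v String1 x y z
a String1 c d e f g h i j k l m n [older]> p q r s t u v string2 x y z
EOF
}

# Hypothetical helper: keeps lines containing both strings, strips a
# leading "[" and a trailing "]>" from field 15, then prints fields 1 and 15.
strip_field15() {
  awk '/String1/ && /string2/ {
    gsub(/^\[|\]>$/, "", $15)
    print $1, $15
  }'
}

demo_input | strip_field15
```

This prints "a old" and "a older", matching the result shown in Answer 1.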

Answer 3

If you basically just want to delete the characters [, ] and >, you can simply use tr -d for this, something like:

... | tr -d "[]>"

$ echo "2023-01-20 [text1]>" | tr -d "[]>"
2023-01-20 text1
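
A small sketch of how tr -d behaves here, assuming the printed fields contain no other [, ] or > characters you want to keep. tr works on characters, not fields, so it deletes them wherever they appear. The function name strip_chars and the second input line are made up for illustration.

```shell
# Hypothetical wrapper around the tr call from the answer.
strip_chars() {
  tr -d '[]>'
}

# The line from the question: only the wrappers are removed.
printf '%s\n' '2023-01-20 [text1]>' | strip_chars

# A made-up line with a ">" outside the brackets: that one is deleted too.
printf '%s\n' 'a>b [text5]>' | strip_chars
```

If stray brackets elsewhere in the line matter, the awk approaches that touch only field 15 are the safer choice.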
