Using bash to extract regex groups from text and output them to a file

Question

I need to scan a log file and extract the relevant parts of it into another file. The log format is:

   [hh:mm:ss] Header
   [hh:mm:ss] irrelevant text
   [hh:mm:ss] irrelevant text
   [hh:mm:ss]Error text
   [hh:mm:ss] some details
   [hh:mm:ss] end_error;
   [hh:mm:ss] irrelevant text
   [hh:mm:ss] Warning text
   [hh:mm:ss] some details
   [hh:mm:ss] end_warning;
   [hh:mm:ss] irrelevant text
   [hh:mm:ss] irrelevant text
   [hh:mm:ss]Error text
   [hh:mm:ss] some details
   [hh:mm:ss] end_error;

I need to find every occurrence of Error and Warning and capture the following text:

[hh:mm:ss]Error text
[hh:mm:ss] some details
[hh:mm:ss] end_error;
[hh:mm:ss] Warning text
[hh:mm:ss] some details
[hh:mm:ss] end_warning;
[hh:mm:ss]Error text
[hh:mm:ss] some details
[hh:mm:ss] end_error;

What is the simplest way to achieve this in bash?

Answer 1

$ awk '/^(Error|Warning)/{f=1} f; /;/{f=0}' file
Error text
end_error;
Warning text
end_warning;

Your original input file showed Error and Warning at the start of each line, so the script above anchors the match to the start of the line (^). With your latest sample input file and desired output, you would need:

$ awk '
   /^[[:space:]]*\[[^]]+\][[:space:]]*(Error|Warning)/ { found=1 }
   found { sub(/^[[:space:]]+/,""); print }
   /;/ { found=0 }
' file
[hh:mm:ss]Error text
[hh:mm:ss] some details
[hh:mm:ss] end_error;
[hh:mm:ss] Warning text
[hh:mm:ss] some details
[hh:mm:ss] end_warning;
[hh:mm:ss]Error text
[hh:mm:ss] some details
[hh:mm:ss] end_error;

The regexp is that complex in order to avoid false matches if the words Error or Warning appear elsewhere in your input file.
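To make the false-match point concrete, here is a minimal sketch that runs a loose pattern and the anchored pattern over the same input. The sample log lines and the variable names (log, loose, strict) are invented for illustration and are not from the question.

```shell
#!/usr/bin/env bash
# Hypothetical sample log: the second line mentions "Error" mid-line,
# but it is not the start of an error block.
log=$(mktemp)
cat > "$log" <<'EOF'
[10:00:00] Header
[10:00:01] retrying after Error count reset
[10:00:02]Error text
[10:00:03] some details
[10:00:04] end_error;
[10:00:05] irrelevant text
EOF

# Loose: any line containing Error or Warning opens a block.
loose=$(awk '/Error|Warning/{f=1} f; /;/{f=0}' "$log")

# Anchored: the keyword must directly follow the [timestamp] at line start.
strict=$(awk '/^[[:space:]]*\[[^]]+\][[:space:]]*(Error|Warning)/{f=1} f; /;/{f=0}' "$log")

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
rm -f "$log"
```

With this input the loose pattern also prints the "retrying after Error count reset" line, while the anchored one starts at the real error block.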

Answer 2

This uses the sed range operator with GNU sed's -n and -r options, which suppress default printing and enable extended regular expressions, respectively. The p flag prints each line in the matched range.

$ sed -nr '/^(Error|Warning)/,/;/p' file
Error text
end_error;
Warning text
end_warning;
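The command above matches the sample without timestamps. For the bracketed [hh:mm:ss] format, a sketch along the same lines could look like the following. It assumes a sed that accepts -E (current GNU sed treats it as a synonym for -r, and BSD sed supports it too); the sample lines and the names log and sed_out are made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical indented, timestamped sample.
log=$(mktemp)
cat > "$log" <<'EOF'
   [10:00:00] Header
   [10:00:01] irrelevant text
   [10:00:02]Error text
   [10:00:03] some details
   [10:00:04] end_error;
   [10:00:05] irrelevant text
   [10:00:06] Warning text
   [10:00:07] end_warning;
EOF

# Range from a timestamped Error/Warning line to the next line containing ";".
# Inside the range, strip leading whitespace and print.
sed_out=$(sed -nE '/^[[:space:]]*\[[^]]+\][[:space:]]*(Error|Warning)/,/;/{s/^[[:space:]]+//;p;}' "$log")

printf '%s\n' "$sed_out"
rm -f "$log"
```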

You can do the same in awk too, but using Ed's approach is almost always recommended.

$ awk '/^(Error|Warning)/,/;/' file
Error text
end_error;
Warning text
end_warning;
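One reason the flag-based form is often preferred is that small variations only need the rules reordered, whereas the range form always prints both boundary lines. The sketch below illustrates this on an invented sample; the variable names (range_out, no_end, no_start) are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical un-timestamped sample.
log=$(mktemp)
cat > "$log" <<'EOF'
Header
Error text
some details
end_error;
irrelevant text
EOF

# Range form: prints the start line, the end line, and everything between.
range_out=$(awk '/^(Error|Warning)/,/;/' "$log")

# Flag form with the flag cleared before the print rule runs:
# the closing "...;" line is left out.
no_end=$(awk '/^(Error|Warning)/{f=1} /;/{f=0} f' "$log")

# Flag form with the print rule first: the start line is left out instead.
no_start=$(awk 'f; /^(Error|Warning)/{f=1} /;/{f=0}' "$log")

printf 'range:\n%s\n\nno end line:\n%s\n\nno start line:\n%s\n' \
  "$range_out" "$no_end" "$no_start"
rm -f "$log"
```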

Answer 3

Try:

awk '/^(Error|Warning)/,/;$/ { print $0 }' file > output

awk reads the file and prints each block from a line starting with Error or Warning up to the first line ending with ;, and the result is saved in output.
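Note that the ;$ anchor requires the semicolon to be the very last character of the line. If the log has Windows line endings or trailing spaces (an assumption about the data, not something the question states), the range never closes and everything after the first match is printed. A sketch that tolerates trailing whitespace, using made-up sample data and hypothetical names log, out and result:

```shell
#!/usr/bin/env bash
log=$(mktemp)
out=$(mktemp)

# Hypothetical sample with CRLF line endings.
printf 'Header\r\nError text\r\nend_error;\r\nirrelevant text\r\n' > "$log"

# Allow optional whitespace (which includes \r) between ";" and end of line,
# and strip the trailing carriage return before printing.
awk '/^(Error|Warning)/,/;[[:space:]]*$/ { sub(/\r$/, ""); print }' "$log" > "$out"

result=$(cat "$out")
printf '%s\n' "$result"
rm -f "$log" "$out"
```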

3 answers

Answer 1: 1, 2015-03-02 00:29:57

Answer 2: 1, 2015-03-02 02:00:04

Answer 3: 0, 2015-03-02 00:26:40
