How do I match only the first N instances of a pattern, then print the lines following each match until a blank line?

Question

I have a log file summarising calculation results that I need to prepare for analysis. Each result is given a heading of the form:

"Excited State   1:      Triplet-A      3.1118 eV  398.43 nm  f=0.0000"

Followed by an unknown number of data lines of the form:

"76 -> 81  0.36917"

(an integer, an arrow, another integer, then a float). Each result is separated from the next result by a blank line. I want to be able to take the first two sets of results (including the data lines) where the heading contains the pattern "Triplet". Later, I need to be able to do the same for the "Singlet" pattern, so I can't just delete those.

Unfortunately, it is important for later analysis that the data lines be kept separated in some way, as I will need to sort the data lines in decreasing order of magnitude (by the float column).

I have been able to use sed to return all instances of the Triplet headings and the data lines that follow them (until the blank line), as follows:

sed -n '/Triplet/,/^ *$/p' test.txt

But I don't know how to get only the first two instances.

Ideally, if the input file looks like the following:

 Excited State   1:      Triplet-A      3.1118 eV  398.43 nm  f=0.0000
76 -> 81         0.36917
76 ->101         0.11911
...

Excited State   2:      Singlet-A      3.3656 eV  379.43 nm  f=0.0029
76 -> 81         0.38068
76 ->101         0.10777
...

Excited State   3:      Triplet-A      3.1118 eV  398.43 nm  f=0.0000
76 -> 81         0.36917
76 ->101         0.11911
...
...

I'd like to be able to get:

Excited State   1:      Triplet-A      3.1118 eV  398.43 nm  f=0.0000
76 -> 81         0.36917
76 ->101         0.11911
...

Excited State   3:      Triplet-A      3.1118 eV  398.43 nm  f=0.0000
76 -> 81         0.36917
76 ->101         0.11911
...

And while, in this case, I could just remove the second data set, that won't generalise.

Answer 1

$ awk '/Triplet/ { n += 1 } n <= 2 && /Triplet/,/^ *$/' input.txt
 Excited State   1:      Triplet-A      3.1118 eV  398.43 nm  f=0.0000
76 -> 81         0.36917
76 ->101         0.11911
...

Excited State   3:      Triplet-A      3.1118 eV  398.43 nm  f=0.0000
76 -> 81         0.36917
76 ->101         0.11911
...
...
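
To reuse the same idea for the "Singlet" headings or for a different count, the pattern and the limit can be passed in as awk variables. This is only a minimal sketch, assuming a POSIX awk; the function name first_blocks and the variable names pat and max are made up for illustration:

```shell
# first_blocks PATTERN COUNT FILE
# Print the first COUNT blocks whose heading matches PATTERN,
# each block running up to and including the next blank line.
first_blocks() {
    awk -v pat="$1" -v max="$2" '
        # count every line that matches the heading pattern
        $0 ~ pat { n++ }
        # range pattern: starts at a heading that is still within the limit,
        # ends at the first blank (or spaces-only) line
        n <= max && $0 ~ pat, /^ *$/
    ' "$3"
}
```

For example, `first_blocks Singlet 2 test.txt` should give the first two Singlet blocks. The sketch assumes the pattern never occurs in a data line, since every matching line increments the counter.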

Answer 2

A GNU awk version (GNU because RS has multiple characters):

awk -v RS='Excited State' '/Triplet/ {if (n++<2) printf "%s",RS$0}' file
Excited State   1:      Triplet-A      3.1118 eV  398.43 nm  f=0.0000
76 -> 81         0.36917
76 ->101         0.11911
...

Excited State   3:      Triplet-A      3.1118 eV  398.43 nm  f=0.0000
76 -> 81         0.36917
76 ->101         0.11911
...
...

RS='Excited State' sets the record separator to Excited State, so awk works in block mode
/Triplet/ tests whether the block contains Triplet; if so:
- if (n++<2) tests whether the counter, which starts at zero, is less than two, so that only two blocks are printed; then:
- - printf "%s",RS$0 prints the record separator followed by the block

PS: this will work even if the blank line is missing between blocks.

Answer 3

This might work for you (GNU sed):

sed -E '/Triplet/{x;s/^/x/;/^x{1,2}$/{x;:a;n;/\S/ba;p;x};x};d' file

Focus on a line containing Triplet and, after incrementing a counter in the hold space, determine whether to print that line and the lines that follow it, up to and including an empty one.
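
The one-liner above is dense, so here is the same script spread over several lines with a comment per step. It is a sketch that assumes GNU sed (for -E and \S); the wrapper function name first_two_triplets is made up for illustration:

```shell
# first_two_triplets FILE
# Same commands as the one-liner, one per line (GNU sed).
first_two_triplets() {
    sed -E '
    /Triplet/{
        # swap in the hold space, which holds the counter
        x
        # add one "x" to the counter for each Triplet heading seen
        s/^/x/
        # only when the counter is "x" or "xx" (first or second heading)
        /^x{1,2}$/{
            # swap the heading line back into the pattern space
            x
            :a
            # print the current line and read the next one
            n
            # keep looping while the line has a non-whitespace character
            /\S/ba
            # print the blank line that ends the block
            p
            # swap, so that the x below puts the counter back in the hold space
            x
        }
        # return the counter to the hold space
        x
    }
    # delete every other line instead of auto-printing it
    d
    ' "$1"
}
```

The counter is just a string of x characters in the hold space, so changing {1,2} to {1,N} would select the first N blocks.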

Answer 4

If all records are separated by a blank line, you can easily do the following:

$ awk 'BEGIN{RS="";FS=OFS="\n";n=2}($1~/Triplet/ && n-->0);(n==0){exit}' file
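
One thing to be aware of with paragraph mode (RS=""): the blank line is consumed as the record terminator, so the one-liner above prints the selected blocks back to back. If the blank separator should survive in the output, setting ORS to two newlines is one way to keep it. A minimal sketch under that assumption; the function name triplets_paragraph is made up for illustration:

```shell
# triplets_paragraph FILE
# Paragraph mode: each blank-line-separated block is one record,
# and each line of the block is one field.
triplets_paragraph() {
    awk '
        BEGIN { RS = ""; FS = "\n"; ORS = "\n\n"; n = 2 }
        # print the block if its first line is a Triplet heading
        # and fewer than two blocks have been printed so far
        ($1 ~ /Triplet/ && n-- > 0)
        # stop reading the file once two blocks are out
        n == 0 { exit }
    ' "$1"
}
```

Because each data line is a separate field here, this form is also a convenient starting point for the later per-block sorting mentioned in the question.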
