Output text between two regular expression patterns, with several lines of context

Question

If I take myfile to an environment where python is available, I can run the following command:

cat myfile | python filter.py

filter.py

import sys

# Read all of stdin, stripping line endings.
results = [line.rstrip("\n\r") for line in sys.stdin]

start_match = "some text"
lines_to_include_before_start_match = 4
end_match = "some other text"
lines_to_include_after_end_match = 4

for line_number, line in enumerate(results):
    if start_match in line:
        # Context lines before the start match.
        for x in range(line_number - lines_to_include_before_start_match, line_number):
            print(results[x])

        print(line)

        # Everything up to and including the end match.
        for x in range(line_number + 1, len(results)):
            print(results[x])
            if end_match in results[x]:
                # Context lines after the end match. The stop value is
                # exclusive, so this prints one line fewer than the setting.
                for z in range(x + 1, x + lines_to_include_after_end_match):
                    print(results[z])
                break

        print("")

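But the environment I need to run this in does not have python. Is my only option to convert it to perl (which I know exists in the environment)? Is there a simple sed or awk command that does this?

I have tried the following, but it does not quite do what I want because the 4 lines before and after are missing: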

cat myfile | sed -n '/some text/,/some other text/p'

[Edit: the Python script says lines_to_include_after_end_match is 4, but it actually returns 3]
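
The off-by-one mentioned in the edit comes from the exclusive stop value of the range (xrange in Python 2) in the trailing loop. A minimal sketch, where the match position x = 10 is a made-up value chosen only for illustration:

```python
# Hypothetical values for illustration only.
x = 10                                # index of the line containing end_match
lines_to_include_after_end_match = 4

# Same bounds as the trailing loop in filter.py: the stop value is exclusive.
as_written = list(range(x + 1, x + lines_to_include_after_end_match))

# Adding 1 to the stop value yields the intended number of lines.
intended = list(range(x + 1, x + 1 + lines_to_include_after_end_match))

print(as_written)   # [11, 12, 13]
print(intended)     # [11, 12, 13, 14]
```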

Answer 1

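Assuming line endings are \n, you can try the following:

awk '/some text/{if(l4)printf "%s",l4;p=5} /some other text/{e=1} e && p {p--; if (!p) {e=0;l4="";}} !p && !e { l4 = l4 $0 "\n"; if (gsub(/\n/,"&",l4) > 4) sub(/^[^\n]*\n/,"",l4);} p' file

Here l4 buffers the last 4 lines seen outside a block, and p counts down the lines still to be printed.
Note that if you want to print 4 more lines after the end match, p must be set to 6 instead of 5.
I think your own python code only prints 3 more lines after the end match.

Split across several lines for readability:

awk '/some text/{if(l4)printf "%s",l4;p=5}
    /some other text/{e=1}
    e && p {p--; if (!p) {e=0;l4="";}}
    !p && !e { l4 = l4 $0 "\n"; if (gsub(/\n/,"&",l4) > 4) sub(/^[^\n]*\n/,"",l4);}
    p' file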
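
For checking the awk output on a machine that does have Python, the same state machine can be sketched as below. This is only an approximation under stated assumptions: the function name extract and the parameter p_start are made up for this sketch, and it uses plain substring tests where awk uses regular expressions.

```python
from collections import deque


def extract(lines, start="some text", end="some other text", before=4, p_start=5):
    """Hypothetical mirror of the awk program; p_start plays the role of p=5."""
    out = []
    buf = deque(maxlen=before)   # last `before` lines seen outside a block (awk: l4)
    p = 0                        # countdown of lines still to print (awk: p)
    e = False                    # end pattern has been seen (awk: e)
    for line in lines:
        if start in line:        # start match: flush the buffer, arm the counter
            out.extend(buf)
            p = p_start
        if end in line:          # end match: start counting down
            e = True
        if e and p:
            p -= 1
            if p == 0:           # countdown finished: reset state
                e = False
                buf.clear()
        if p == 0 and not e:     # outside a block: remember the line
            buf.append(line)
        if p:                    # inside a block or its trailing context
            out.append(line)
    return out


sample = ["line %d" % i for i in range(1, 21)]
sample[7] = "some text"          # line 8
sample[11] = "some other text"   # line 12

result = extract(sample)
print(result[0], "...", result[-1])   # line 4 ... line 15
```

With p_start=5 the output stops 3 lines after the end match; passing p_start=6 extends it to 4 lines, which matches the note about the counter.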

Answer 2

This might work for you (GNU sed):

sed ':a;$!{N;s/\n/&/4;Ta};/1st text/{:b;n;/2nd text/!bb;:c;N;s/\n/&/4;Tc;b};$d;D' file

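This opens a window of n lines. If the window contains 1st text, it is printed and printing continues until 2nd text, after which a further m lines are read and printed. Otherwise, at the end of the file the buffered lines are deleted; anywhere else the first line of the buffer is deleted and the process repeats.

If the matching text is at the beginning or the end of a line, use: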

sed ':a;$!{N;s/\n/&/4;Ta};/^start/M{:b;n;/end$/M!bb;:c;N;s/\n/&/4;Tc;b};$d;D' file
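
The M modifier matters here because the pattern space holds several lines at once: without it, ^ and $ only anchor to the start and end of the whole buffer. The same distinction can be sketched with Python's re.MULTILINE flag, used here purely as an analogy (it is not how sed is implemented); the buffer contents are made up:

```python
import re

# Made-up buffer holding several lines, like sed's pattern space after N.
window = "alpha\nstart of block\nbeta\nthe end\ngamma"

# Without MULTILINE, ^ and $ anchor only to the whole buffer.
plain_start = re.search(r"^start", window)
plain_end = re.search(r"end$", window)

# With MULTILINE, they also anchor at every embedded newline.
multi_start = re.search(r"^start", window, re.MULTILINE)
multi_end = re.search(r"end$", window, re.MULTILINE)

print(plain_start, plain_end)                    # None None
print(multi_start.group(), multi_end.group())    # start end
```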

Answer 3

Using sed, try:

sed -n "$(($(sed -n '/some text/=' myfile) - 4)),$(($(sed -n '/some other text/=' myfile) + 4))p" myfile

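The command sed -n '/some text/=' returns the number of the line that matches some text.
4 is then subtracted from that number.
The next part, sed -n '/some other text/=', works the same way, and 4 is added to the line number it returns.

Note that the script scans the input file 3 times, so it may not be suitable where execution time is critical.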
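
The line-number arithmetic behind this command can be sketched as follows. The helper context_range is hypothetical and not part of any tool; it takes the first match of each pattern, uses substring tests instead of regexes, and clamps the range to the file, which the sed one-liner itself does not do (a start match within the first 4 lines would give sed a non-positive address):

```python
def context_range(lines, start="some text", end="some other text", context=4):
    """Return 1-based (first, last) line numbers to print, or None if no match."""
    # Line number of the first line containing the start text.
    start_no = next((n for n, l in enumerate(lines, 1) if start in l), None)
    if start_no is None:
        return None
    # Line number of the first line after it containing the end text.
    end_no = next(
        (n for n, l in enumerate(lines, 1) if n > start_no and end in l), None
    )
    if end_no is None:
        return None
    # Widen by `context` lines on both sides, staying inside the file.
    first = max(1, start_no - context)
    last = min(len(lines), end_no + context)
    return first, last


sample = ["line %d" % i for i in range(1, 21)]
sample[7] = "some text"          # line 8
sample[11] = "some other text"   # line 12

print(context_range(sample))     # (4, 16)
```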

[Edit]

If the file contains more than one "some other text", use this instead:

sed -n "$(($(sed -n '/some text/=' myfile) - 4)),\$p" myfile | sed "/some other text/{N;N;N;q}"
