sed matching multiple line pattern

I have a log in the following format:

<<
[ABC] some other data
some other data
>>

<<
DEF some other data
some other data
>>

<<
[ABC] some other data
some other data
>>

I want to select all log entries that contain ABC. The expected result is:

<<
[ABC] some other data
some other data
>>

<<
[ABC] some other data
some other data
>>

What sed expression would do this? To fetch the contents between << and >>, the expression is:

sed -e '/<</,/>>/!d' 

But how can I force it to require [ABC] in between?

This might work for you:

sed '/^<</,/^>>/{/^<</{h;d};H;/^>>/{x;/^<<\n\[ABC\]/p}};d' file
<<
[ABC] some other data
some other data
>>
<<
[ABC] some other data
some other data
>>

sed comes equipped with a register called the hold space (HS).

You can use the HS to collect data of interest, in this case the lines in the range /^<</,/^>>/:

h replaces whatever is in the HS with what is in the pattern space (PS)

H appends a newline \n and then the PS to the HS

x swaps the HS for the PS
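To make the roles of h, H and x easier to follow, here is a sketch of the same idea written one command per line with comments. It assumes GNU sed (for \n in the address regex), uses -n instead of the trailing d, and wraps the script in a shell function whose name, select_abc, is made up for this illustration.

```shell
#!/bin/sh
# select_abc (hypothetical name): print only the <<...>> blocks whose first
# line starts with [ABC]. Assumes GNU sed.
select_abc() {
    sed -n '
        /^<</,/^>>/ {
            # opening delimiter: overwrite the hold space with "<<"
            /^<</ { h; d; }
            # any other line of the block: append it to the hold space
            H
            # closing delimiter: swap the collected block into the pattern space
            /^>>/ {
                x
                # print only if the line right after "<<" starts with [ABC]
                /^<<\n\[ABC\]/ p
            }
        }
    ' "$@"
}

# Usage with a small made-up sample on stdin
printf '%s\n' '<<' '[ABC] a' 'b' '>>' '' '<<' 'DEF c' 'd' '>>' | select_abc
```

Because the whole block sits in the pattern space when the test runs, the regex can look across the newline between << and the first data line.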

NB: This deletes all lines other than those in <<...>> blocks containing [ABC]. If you want to retain the lines outside the <<...>> blocks, use:

sed '/^<</,/^>>/{/^<</{h;d};H;/^>>/{x;/^<<\n\[ABC\]/p};d}' file
<<
[ABC] some other data
some other data
>>


<<
[ABC] some other data
some other data
>>
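The only difference between the two commands is whether the final d sits outside or inside the outer braces. A small side-by-side sketch, assuming GNU sed; the function names and the sample input (including the "stray line") are invented for the illustration:

```shell
#!/bin/sh
# Hypothetical sample: one [ABC] block, one line outside any block, one DEF block.
sample() {
    printf '%s\n' '<<' '[ABC] a' 'b' '>>' 'stray line' '<<' 'DEF c' 'd' '>>'
}

# d outside the braces: every line is deleted unless it was printed with p
only_abc() {
    sed '/^<</,/^>>/{/^<</{h;d};H;/^>>/{x;/^<<\n\[ABC\]/p}};d'
}

# d inside the braces: lines outside <<...>> reach the end of the script
# and are printed by default; non-[ABC] blocks are still dropped
keep_rest() {
    sed '/^<</,/^>>/{/^<</{h;d};H;/^>>/{x;/^<<\n\[ABC\]/p};d}'
}

sample | only_abc
echo '---'
sample | keep_rest
```

The first call prints just the [ABC] block; the second prints the [ABC] block followed by "stray line".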

This works on my side:

awk '$0~/ABC/{print "<<";print;getline;print;getline;print }' temp.txt

Tested as below:

pearl.242> cat temp.txt
<< 
[ABC] some other data 
some other data 
>>  
<< 
DEF some other data 
some other data 
>>  

nkeem

<< 
[ABC] some other data 
some other data 
>> 
pearl.243> awk '$0~/ABC/{print "<<";print;getline;print;getline;print }' temp.txt
<<
[ABC] some other data 
some other data 
>>  
<<
[ABC] some other data 
some other data 
>> 
pearl.244> 

If you do not want to hard-code the print "<<"; statement, you can use the following:

pearl.249> awk '$0~/ABC/{print x;print;getline;print;getline;print}{x=$0}' temp.txt
<< 
[ABC] some other data 
some other data 
>>  
<< 
[ABC] some other data 
some other data 
>> 
pearl.250> 
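Both awk commands above rely on every entry being exactly the [ABC] line plus two more lines. If entries can vary in length, one possible variation is to buffer each entry and decide at the closing >>. This is only a sketch: the function name select_abc_awk is made up, and it assumes << and >> always start their own lines, optionally followed by spaces.

```shell
#!/bin/sh
# select_abc_awk (hypothetical name): buffer each <<...>> entry and print it
# only if the line right after "<<" starts with [ABC].
select_abc_awk() {
    awk '
        # opening delimiter: start buffering a new entry
        /^<</ { buf = $0; inblock = 1; next }
        inblock {
            buf = buf "\n" $0
            # closing delimiter: test the buffered entry, then stop buffering
            if (/^>>/) {
                if (buf ~ /^<< *\n\[ABC\]/) print buf
                inblock = 0
            }
        }
    ' "$@"
}

# Usage with a made-up three-line entry, a stray line and a DEF entry
printf '%s\n' '<< ' '[ABC] a' 'b' 'c' '>>' 'junk' '<<' 'DEF x' '>>' | select_abc_awk
```

Lines outside any entry are never buffered, so they are dropped just as in the original commands.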

To me, sed is line based. You can probably talk it into being multi-line, but it would be easier to start the job with awk or perl rather than trying to do it in sed.

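I'd use perl and make a little state machine along these lines (I don't guarantee it'll catch every little detail of what you are trying to achieve):

use strict;
use warnings;

# states: 0 = outside a block, 1 = just saw "<<", 2 = inside an [ABC] block
my $state  = 0;
my $buffer = '';
while (my $line = <>) {
    if ($state == 0) {
        if ($line =~ /^<<\s*$/) { $buffer = $line; $state = 1; }
    }
    elsif ($state == 1) {
        # keep the block only if its first line starts with [ABC]
        if ($line =~ /^\[ABC\]/) { $buffer .= $line; $state = 2; }
        else                     { $state = 0; }
    }
    else {
        $buffer .= $line;
        if ($line =~ /^>>\s*$/) {
            print $buffer;    # do something with buffer
            $state = 0;
        }
    }
}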

See also http://www.catonmat.net/blog/awk-one-liners-explained-part-three/ for some hints on how you might do it with awk as a one-liner...

TXR: built for multi-line stuff.

@(collect)
<<
[ABC] @line1
@line2
>>
@  (output)
<<
[ABC] @line1
@line2
>>

@  (end)
@(end)

Run:

$ txr data.txr data
<<
[ABC] some other data
some other data
>>

<<
[ABC] some other data
some other data
>>

Very basic stuff; you're probably better off sticking to awk until you have a very complicated multi-line extraction job with irregular data, numerous cases, lots of nesting, etc.

If the log is very large, we should write @(collect :vars ()) so the collect doesn't implicitly accumulate lists; then the job will run in constant memory.

Also, if the logs are not always two lines, it becomes a little more complicated. We can use a nested collect to gather the variable number of lines.

@(collect :vars ())
<<
[ABC] @line1
@  (collect)
@line
@  (until)
>>
@  (end)
@  (output)
<<
[ABC] @line1
@  {line "\n"}
>>

@  (end)
@(end)
