
Sed progressively for lines between two patterns

Here is my sample input:

BEGIN
one
one
one one
one
END
filler filler filler filler
BEGIN
two two
two
two two
END
filler filler filler filler
BEGIN
three three three
three three
three
END

I want to extract the lines between (and including) BEGIN and END. I already have a sed command that does this:

sed '/BEGIN/,/END/!d' file

But I'd like to extract the blocks progressively. That is, what can I do to the sed command above so that it outputs only the first block? And then the second block? And the third? And so on.

(As some of you may guess, my end goal is to parse through a file with x509 certificates and extract data on each certificate in the file, rather than just the first certificate in a file which openssl does by default. If there is an easier alternative than the above, I'm all ears).

I'm not sure you can easily do that in sed, but you can in awk:

awk '/^BEGIN$/         { file = sprintf("file%d.out", ++i); }
     /^BEGIN$/,/^END$/ { print > file }' data

This generates file1.out for the first block, file2.out for the second, etc.


Could you explain the working parts to your awk?

The first rule matches lines that consist solely of BEGIN and generates a file name in the variable file, using a counter in the variable i (pre-incremented, so the first file is file1.out).

The second rule matches each range of lines from BEGIN to END and uses print (short for print $0) redirected to the file currently named by the variable file. Each block is therefore written to its own file.
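
For reference, the same two rules can be laid out with comments, as in the sketch below. It rests on a few assumptions: a shortened copy of the sample is written to a file named data, and the close() call is an addition that is not in the one-liner above, included so that an input with many blocks does not run into awk's limit on open files.

```shell
# Write a shortened copy of the sample input to a file named "data" (assumed name).
cat > data <<'EOF'
BEGIN
one
END
filler filler filler filler
BEGIN
two two
two
END
EOF

awk '
    # A line that is exactly BEGIN starts a new block: close the previous
    # output file (if there is one) and build the next file name.
    /^BEGIN$/ { if (file != "") close(file); file = sprintf("file%d.out", ++i) }
    # Every line from BEGIN through END goes to the current output file.
    /^BEGIN$/,/^END$/ { print > file }
' data

# Show the second block as a quick check.
cat file2.out
```

The filler line between the blocks matches neither rule, so it is not written anywhere.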

Also, how would you change it to instead output the contents to stdout? I was hoping for a way to specify a "Nth" pattern argument, which I was going to supply from a simple for loop that was running for as many times as the pattern "BEGIN" was found to get a total count.

You can do that by using one line to count the blocks and skip all except the relevant one, and then simply print the data for the block that is relevant.

awk -v N=$N '/^BEGIN$/         { if (++i != N) next; }
             /^BEGIN$/,/^END$/ { print }' data

The -v N=$N option passes the shell variable $N to awk. The first line counts the sections (using i), skipping every BEGIN line except the Nth. The range in the second line can only start when the first line does not skip the BEGIN line, so it prints the contents of the Nth block. Some awk aficionados (who are probably APL programmers in their spare time) would omit the { print } block, but I think keeping it makes the code clearer to anyone else who has to maintain it.
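
The loop described in the question could then look roughly like the sketch below. It assumes the sample is stored in a file named data, counts the blocks with grep -c, and pipes each block to wc -l as a stand-in for whatever per-block command is really wanted. For certificates that might be something like openssl x509 reading from standard input, with the patterns changed to match the actual marker lines.

```shell
# Write a copy of the sample input to a file named "data" (assumed name).
cat > data <<'EOF'
BEGIN
one
one
END
filler filler filler filler
BEGIN
two two
END
filler filler filler filler
BEGIN
three three three
three three
three
END
EOF

# Count the blocks: one per line that is exactly BEGIN.
total=$(grep -c '^BEGIN$' data)

N=1
while [ "$N" -le "$total" ]; do
    # Print only the Nth block; wc -l stands in for the real per-block command.
    lines=$(awk -v N="$N" '/^BEGIN$/         { if (++i != N) next }
                           /^BEGIN$/,/^END$/ { print }' data | wc -l)
    # The arithmetic expansion strips any padding that wc adds.
    echo "block $N: $((lines)) lines"
    N=$((N + 1))
done
```

This rereads the file once per block, which is fine for small inputs; for large files the file-per-block approach above reads the input only once.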

It is also possible to go the opposite way: suppress the default output with -n and print only the lines between the patterns.

sed -n '/BEGIN/,/END/p' <file

With awk, you can extract just the second block without reading the rest of the file. The result is written to file.out, and awk exits as soon as the END line of that block has been written. Set n to the number of the block you want.

n=2
awk -v N="$n" '/^BEGIN$/ { ++i }
     /^BEGIN$/,/^END$/ { if (i == N) { print > "file.out"; if (/^END$/) exit } }' file


 