I have the following file:
G
H
A
B
C
D
N
Deleting the lines from A through D should give the following output:
G
H
N
This is easy to do with sed '/A/,/D/d', but if my file doesn't contain D, sed deletes everything from A to the end of the file. If the second pattern (D) is missing, I want nothing to be deleted and the full file to be shown.
Second question: how can I delete the lines between the patterns plus the line after the end pattern (N)? Something like sed '/A/,+1d', but sed '/A/,/D/+1d' does not work.
It makes no difference to me whether the solution uses sed, awk, or a python/bash script.
With a two-pass awk (the file is read twice, so it is named twice on the command line) you can do this:
# when 2nd pattern is not found
awk -v ps='A' -v pe='P' 'NR==FNR{if ($0 ~ ps) start=FNR; else if ($0 ~ pe) stop=FNR; if (stop) nextfile; else next} !stop || FNR<start || FNR>stop' file file
G
H
A
B
C
D
N
# when 2nd pattern is found
awk -v ps='A' -v pe='D' 'NR==FNR{if ($0 ~ ps) start=FNR; else if ($0 ~ pe) stop=FNR; if (stop) nextfile; else next} !stop || FNR<start || FNR>stop' file file
G
H
N
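For your second question, you can tweak this awk with one more parameter, n, the number of extra lines to delete after the end pattern (n=1 deletes just the next line):
awk -v n=1 -v ps='A' -v pe='D' 'NR==FNR {
    if ($0 ~ ps)
        start=FNR
    else if ($0 ~ pe)
        stop=FNR+n    # also delete n lines after the end pattern
    if (stop)
        nextfile
    else
        next
}
!stop || FNR<start || FNR>stop' file file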
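The same two-pass idea can be sketched in Python for anyone who prefers a script. This is only a minimal sketch, not part of the awk answer: the function name delete_range and the extra parameter are made up for illustration. Unlike the awk, it uses the first line matching the start pattern and only accepts an end match that comes after it.

```python
import re

def delete_range(lines, start_pat, end_pat, extra=0):
    """Return lines with the first start_pat..end_pat block removed.

    If end_pat never appears after start_pat, the input is returned unchanged.
    extra is the number of additional lines to drop after the end line.
    """
    start = stop = None
    # Pass 1: locate the first start line and the first end line after it.
    for i, line in enumerate(lines):
        if start is None:
            if re.search(start_pat, line):
                start = i
        elif re.search(end_pat, line):
            stop = i + extra
            break
    if stop is None:
        return list(lines)  # no complete block: delete nothing
    # Pass 2: keep everything outside the closed interval [start, stop].
    return [line for i, line in enumerate(lines) if i < start or i > stop]

sample = ["G", "H", "A", "B", "C", "D", "N"]
print(delete_range(sample, "A", "D"))           # ['G', 'H', 'N']
print(delete_range(sample, "A", "P"))           # unchanged: no end pattern
print(delete_range(sample, "A", "D", extra=1))  # ['G', 'H']
```

To run it on a real file, something like delete_range(open("file").read().splitlines(), "A", "D") should work, at the cost of holding the whole file in memory.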
One option out of many, using perl: hold the text in an accumulator once you see A, then print it at the end if you didn't see D. That way you make only one pass through the file (although you use a lot of memory for big files!).
use strict; use warnings;
my $accumulator = '';   # Text we're holding while we wait for a "D"
my $printing = 1;       # Are we currently printing lines?
while(<>) {
    if(/A/) {               # stop printing; start accumulating
        $printing = 0;
        $accumulator .= $_; # $_ is the current line
        next;
    }
    if(/D/) {               # we got a D, so we're back to printing
        $accumulator = '';  # discard the text we now know we're deleting
        $printing = 1;
        next;
    }
    if($printing) {
        print;
    } else {
        $accumulator .= $_;
    }
}
print $accumulator;     # which is empty if we had both A and D
I tried this on your testcase, and on your testcase with the D removed. It can also handle files with multiple A / D pairs. I have not tested it on files where the D comes before the A, or on files with a single line including both A and D.
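For comparison, the same hold-back idea can be written as a Python generator. This is a hedged sketch rather than a port of any answer here: drop_blocks and its default patterns are illustrative names, and one assumption is made explicit in the code, namely that a line matching the end pattern outside a block is passed through untouched.

```python
import re

def drop_blocks(lines, start_pat="A", end_pat="D"):
    """Yield lines, skipping each start_pat..end_pat block.

    Lines are held back from the start line onward; they are discarded when
    the end line arrives, or released at end of input if it never does.
    """
    held = None  # None: passing lines through; list: inside a candidate block
    for line in lines:
        if held is None:
            if re.search(start_pat, line):
                held = [line]      # start holding lines back
            else:
                yield line         # includes stray end-pattern lines
        else:
            held.append(line)
            if re.search(end_pat, line):
                held = None        # complete block: throw it away
    if held is not None:
        yield from held            # unterminated block: nothing is deleted

print(list(drop_blocks(["G", "H", "A", "B", "C", "D", "N"])))  # ['G', 'H', 'N']
print(list(drop_blocks(["G", "H", "A", "B", "C", "N"])))       # all six lines
```

Only the lines of a pending block are buffered, so memory use is bounded by the longest block rather than by the file size, except when the end pattern never appears.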
With D:
$ awk '
/A/ || f {
    f=1                      # flag up
    b=b (b==""?"":ORS)$0     # buffer...
    if(/D/) {                # until D
        b=f=""               # then discard the buffer and reset the flag
    }
    next
}
END {                        # output the buffer if rows run out before finding D
    if(b)
        print b
}1' file                     # output outside the range
G
H
N
With missing D:
$ cat file2
G
H
A
B
C
N
$ awk '/A/||f{f=1;b=b (b==""?"":ORS)$0;if(/D/){b=f=""}next}END{if(b)print b}1' file2
G
H
A
B
C
N
This might work for you (GNU sed):
sed '/A/{:a;N;/D/!ba;d}' file
This matches on A and then gathers lines into the pattern space until D is matched, at which point the pattern space is deleted. If D is never matched, the N command runs out of input and ends sed processing, and by default GNU sed then prints whatever is in the pattern space.
To also delete the line after D (the +1 case), use:
sed '/A/{:a;N;/D/!ba;N;d}' file
NB: if the +1 line does not exist, no lines are deleted.
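The end-of-input behaviour described above is easy to get wrong, so here is a rough Python emulation of the gather-then-delete loop. It is an approximation under stated assumptions, not a model of sed itself: sed_like_delete and plus_one are hypothetical names, and the "pattern space" is just a list of lines that is re-joined for each test.

```python
import re

def sed_like_delete(lines, start_pat="A", end_pat="D", plus_one=False):
    """Approximate /A/{:a;N;/D/!ba;d} (and the extra-N variant) on a list."""
    out = []
    it = iter(lines)
    for line in it:
        if not re.search(start_pat, line):
            out.append(line)
            continue
        space = [line]               # stand-in for the pattern space
        complete = False
        for nxt in it:               # each step plays the role of N
            space.append(nxt)
            # Like sed, test the end pattern against everything gathered.
            if re.search(end_pat, "\n".join(space)):
                complete = True
                break
        if complete and plus_one:
            extra = next(it, None)   # the additional N
            if extra is None:
                complete = False     # no +1 line: keep everything
            else:
                space.append(extra)
        if not complete:
            out.extend(space)        # input ran out: print what was gathered
    return out

sample = ["G", "H", "A", "B", "C", "D", "N"]
print(sed_like_delete(sample))                 # ['G', 'H', 'N']
print(sed_like_delete(sample, plus_one=True))  # ['G', 'H']
print(sed_like_delete(["G", "A", "B", "D"], plus_one=True))  # nothing deleted
```

The last call shows the NB case: the block ends on the final line, so there is no +1 line to consume and the gathered lines are printed instead of deleted.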