
How can I remove multiple lines from a file based on a pattern that spans multiple lines?

I have a text formatted like the following:

2020-05-02
apple
string
string
string
string
string
2020-05-03
pear
string
string
string
string
string
2020-05-03
apple
string
string
string
string
string

Each group has 7 lines: a date, a fruit, and then 5 strings.

I would like to delete groups of 7 lines from the file by supplying just the date and the fruit.

So if I choose '2020-05-03' and 'pear', this would remove:

2020-05-03
pear
string
string
string
string
string

from the file, resulting in this:

2020-05-02
apple
string
string
string
string
string
2020-05-03
apple
string
string
string
string
string

The file contains thousands of lines. I need a command, probably using sed or awk, to:

  1. Search for date 2020-05-03

  2. Check if string after date is pear

  3. Delete both lines and the following 5 lines

I know I can delete text with sed, e.g. sed 's/string//g', but I am not sure whether it can delete multiple lines.

Note: a given date followed by a given fruit never occurs more than once, so

2020-05-02
pear

would only occur once in the file

How can I achieve this?

Using awk, you may do this:

awk -v dt='2020-05-03' -v ft='pear' '$1==dt{p=NR; d=$0; next}
p && NR==p+1{del=($1==ft); if (!del) print d}
del && NR<=p+6{next} 1' file

2020-05-02
apple
string
string
string
string
string
2020-05-03
apple
string
string
string
string
string

Explanation:

  • -v dt='2020-05-03' -v ft='pear' : Supply 2 values to awk from the command line
  • $1==dt{p=NR; d=$0; next} : If a line matches the date, store its line number in p, hold the line in d, and print nothing yet, because whether the date line must be deleted is only known once the fruit line has been read
  • p && NR==p+1{del=($1==ft); if (!del) print d} : If p>0 and we are on the line right after the date, set the flag del to 1 if the fruit matches; otherwise set it to 0 and print the held date line
  • del && NR<=p+6{next} : If the flag del is set, skip the fruit line and the 5 lines after it (lines p+1 through p+6)
  • 1 : Default action to print the line
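
awk only writes the filtered text to standard output, so the file itself is left unchanged. Below is a minimal sketch of a shell function that writes the result back to the file through a temporary copy. The function name delete_group is hypothetical. Its awk program is a position-based variant, not the command above: it assumes the file consists solely of complete 7-line groups starting at line 1, as described in the question.

```shell
#!/bin/sh
# delete_group DATE FRUIT FILE   (hypothetical helper, not part of any tool)
# Assumption: FILE contains only complete 7-line groups (date, fruit, 5 strings).
delete_group() {
    tmp=$(mktemp) || return 1
    if awk -v dt="$1" -v ft="$2" '
        NR % 7 == 1 { d = $0; next }    # date line: hold it until the fruit is known
        NR % 7 == 2 {                   # fruit line: decide for the whole group
            del = (d == dt && $0 == ft)
            if (!del) print d           # group is kept: emit the held date line
        }
        !del                            # print the remaining lines of kept groups
    ' "$3" > "$tmp"; then
        cat "$tmp" > "$3"               # overwrite in place, keeping file permissions
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Usage would look like delete_group 2020-05-03 pear file. If the file can contain anything other than complete groups, this position-based check no longer applies and a pattern-driven command is the safer choice.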

This might work for you (GNU sed):

sed '/2020-05-03/{:a;N;s/[^\n]*/&/7;Ta;/^[^\n]*\npear/d}' file

If a line contains 2020-05-03, gather up 7 lines in total; if the 2nd of these lines starts with pear, delete all of them. Otherwise the 7 lines are printed unchanged.
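
The patterns in the command above are unanchored, so the date would also match inside a longer line, and pear would also match a fruit such as pearl. Below is a minimal sketch of a stricter variant that takes the two values from shell variables. It assumes GNU sed and values without regex metacharacters or slashes. The variable names dt and ft and the sample data are illustrative only.

```shell
#!/bin/sh
dt='2020-05-03'   # date to match (must be the whole line)
ft='pear'         # fruit to match (must be the whole line)

# Small sample in the question's format, so the sketch is self-contained.
file=$(mktemp)
printf '%s\n' 2020-05-03 pearl s s s s s 2020-05-03 pear s s s s s > "$file"

# On an exact date line, append the next line (N). If that line is exactly
# the fruit, append the 5 remaining lines of the group and delete all 7.
# Otherwise the date and fruit lines are printed unchanged.
result=$(sed "/^${dt}\$/{N;/\n${ft}\$/{N;N;N;N;N;d;};}" "$file")
printf '%s\n' "$result"
rm -f "$file"
```

With this sample, only the group whose second line is exactly pear is removed; the pearl group is kept.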

