I have a giant .txt file formatted as follows (each non-blank line starts with three spaces):
unwanted text
unwanted text
*wanted text
abc
def
*wanted text 2
content
content
*wanted text 3
content
content
(...)
I'm looking for code that returns only the lines from the first " *" occurrence up to (but excluding) the second " *" occurrence.
Searching through multiple StackOverflow posts, I've managed to get the following working code on Ubuntu (GNU/Linux):
sed -n -e '/^ \*/{p;q}' bigfile.txt && sed -e '1,/ \*/d' -e '/ \*/,$d' bigfile.txt
It gives me the following output (as wanted):
*wanted text
abc
def
\n (representing a wanted blank line)
Although that's exactly the output I want, it's clumsy code, since I have to run sed twice. At first I had only the second part (after "&&"), which returned the right thing except for the first line (*wanted text). I then prepended the first part (before "&&") to get that first line as well. Nothing else I've tried gave a better result.
It bears repeating that this is a very big file, and I'll be doing this recursively in a script, so if possible a /q (quitting after finding the first result) is preferable.
After this is done, I need something that takes the result of the last command as input, so I can get the whole text EXCEPT the prior result, like this:
unwanted text
unwanted text
*wanted text 2
content
content
*wanted text 3
content
content
(...)
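So, in summary, my 2 questions are:
1. Is there a single command that prints the lines from the first " *" occurrence up to (but excluding) the second one?
2. How can I then get the whole text except that result?
Hope I'm clear enough. Please ask if any detail is missing. Thank you very much for your attention!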
awk to the rescue!
$ awk '$1~/^\*/{if(f) exit; f=1} f' file
*wanted text
abc
def
<-- the blank line goes here (the formatter eats it)
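In case the one-liner looks cryptic, here is a small self-contained sketch of how it behaves. The file name demo.txt and the function name first_block are made up for illustration. f is a flag: the first line whose first field starts with * sets it, the second such line makes awk exit (so the rest of a big file is never read), and the bare f pattern at the end prints every line while the flag is set. Testing $1 rather than $0 means the leading spaces don't matter.

```shell
# Build a tiny sample file (hypothetical name); every non-blank line
# gets three leading spaces, as in the question.
printf '   %s\n' 'unwanted text' 'unwanted text' '*wanted text' 'abc' 'def' > demo.txt
printf '\n' >> demo.txt
printf '   %s\n' '*wanted text 2' 'content' 'content' >> demo.txt

# first_block FILE: print from the first "*" line up to, but not
# including, the second one, then stop reading.
first_block() {
    awk '$1 ~ /^\*/ { if (f) exit; f = 1 }  f' "$1"
}

first_block demo.txt
```

This should print the *wanted text line, abc, def and the blank line, and nothing else.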
For the second part:
$ awk '$1~/^\*/{f++} !f||f>1' file
unwanted text
unwanted text
*wanted text 2
content
content
*wanted text 3
content
content
(...)
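Since the question mentions doing this in a script, both parts could probably be produced in one pass over the file instead of two. The following is only a sketch under assumptions: the function name split_first_block and the file names demo2.txt, first.txt and rest.txt are hypothetical, and the output paths are passed in with -v, so they should not contain backslashes.

```shell
# Sample input (hypothetical name), three leading spaces per non-blank line.
printf '   %s\n' 'unwanted text' 'unwanted text' '*wanted text' 'abc' 'def' > demo2.txt
printf '\n' >> demo2.txt
printf '   %s\n' '*wanted text 2' 'content' 'content' >> demo2.txt

# split_first_block INPUT FIRST_OUT REST_OUT
# Writes the first "*" block to FIRST_OUT and every other line to REST_OUT.
split_first_block() {
    awk -v first="$2" -v rest="$3" '
        $1 ~ /^\*/ { n++ }                   # count marker lines seen so far
        n == 1     { print > first; next }   # inside the first block
                   { print > rest }          # before the first or from the second marker on
    ' "$1"
}

split_first_block demo2.txt first.txt rest.txt
cat first.txt
```

Unlike the first command, this one cannot exit early, because the remainder of the file has to be copied anyway.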