I have a project for uni, and I can't get through the first exercise. Here is my problem: I have a file, and I want to select some data from it and 'display' it in another file. The data I'm looking for is scattered around the file, so I need several awk commands in my script to get it.
Query= fig|1240086.14.peg.1
Length=76
Score E
Sequences producing significant alignments: (Bits) Value
fig|198628.19.peg.2053 140 3e-42
> fig|198628.19.peg.2053
Length=553
In the sample above, you can see that there are two kinds of "Length=" lines, and I only want to catch the ones that come right after a "Query=" line. I have to use awk, so I tried this:
awk '{if(/^$/ && $(NR+1)/^Length=/) {split($(NR+1), b, "="); print b[2]}}'
But it doesn't work. Does anyone have an idea?
solution:
awk '/^Length=/ && r~/^Query/{ sub(/^[^=]+=/,""); printf "%s ",$0 }
NF{ r=$0 }END{ print "" }' file
- NF{ r=$0 } - capture the whole non-empty line
- /^Length=/ && r~/^Query/ - on encountering a Length line whose previous line started with Query (ensured by r~/^Query/)

You need to understand how Awk works. It reads a line, evaluates the script, then starts over, reading one line at a time. So there is no way to say "the next line contains this". What you can do is "if this line contains, then remember this until ..."
awk '/Query=/ { q=1; next } /Length/ && q { print } /./ { q=0 }' file
This sets the flag q to 1 (true) when we see Query= and then skips to the next line. If we see Length and we recently saw Query= then q will be 1, and so we print. In other cases, set q back to "not recently seen" on any non-empty line. (I put in the non-empty condition to allow for empty lines anywhere without affecting the overall logic.)
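The flag logic described above can be tried end to end with a small sketch. The file name sample.txt and the function name query_lengths are made up for illustration, and the input is the sample from the question. This variant anchors the patterns at the start of the line and prints only the number after the equals sign instead of the whole line.

```shell
# Recreate the sample input from the question (the file name is hypothetical).
cat > sample.txt <<'EOF'
Query= fig|1240086.14.peg.1
Length=76
Score E
Sequences producing significant alignments: (Bits) Value
fig|198628.19.peg.2053 140 3e-42
> fig|198628.19.peg.2053
Length=553
EOF

# query_lengths FILE (hypothetical helper): print the value of each Length=
# line whose previous non-empty line was a Query= line.
query_lengths() {
  awk '
    /^Query=/       { q = 1; next }                 # remember that we just saw Query=
    /^Length=/ && q { sub(/^Length=/, ""); print }  # strip the key, print the value
    /./             { q = 0 }                       # any non-empty line reaching here clears the flag
  ' "$1"
}

query_lengths sample.txt
```

With the sample input this should print 76 and skip the Length=553 line, because the flag has already been cleared by the time that line is read.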
It sounds like this is what you want for the first part of your question:
$ awk -F'=' '!NF{next} f && ($1=="Length"){print $2} {f=($1=="Query")}' file
76
but I don't know what the second part is about, since there are no "data" lines in your input and only one valid output from your sample input, as best I can tell.
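If the real file holds several Query= records, the same command should print one length per line. Here is a minimal sketch to check that, assuming a made-up input: the file name multi.txt, the function name query_lengths_fs, and the second record (example.peg.2 with Length=120) are invented purely for illustration and do not come from the question.

```shell
# Hypothetical input: the first record is from the question, the second is invented.
cat > multi.txt <<'EOF'
Query= fig|1240086.14.peg.1
Length=76

> fig|198628.19.peg.2053
Length=553

Query= example.peg.2
Length=120
EOF

# query_lengths_fs FILE (hypothetical helper): same command as above,
# wrapped in a function so it can be reused on any file.
query_lengths_fs() {
  awk -F'=' '
    !NF               { next }              # skip blank lines, leaving the flag untouched
    f && $1=="Length" { print $2 }          # previous non-blank line was a Query= line
                      { f = ($1=="Query") } # set the flag for the following line
  ' "$1"
}

query_lengths_fs multi.txt
```

On this input it prints 76 and then 120. The Length=553 line is ignored because the line before it starts with ">" rather than "Query".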