
Move lines in file using awk/sed

Hi, my files look like this:

>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA
>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA

and I want to move the lines so that line 1 swaps with 3, and line 2 swaps with 4.

>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA
>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA

I have thought about using cut to send the lines to separate files and then paste to bring them back in the desired order, but is there a solution using awk/sed?

EDIT: The file always has 4 lines (2 FASTA entries), no more.

For such a simple case, as @Ed_Morton mentioned, you can just swap the even-sized slices with head and tail commands:

$ tail -n 2 test.txt; head -n 2 test.txt

>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA
>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA
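If the swapped result should replace the original file, redirecting straight back into the same file would truncate it before it is read. A minimal sketch, assuming a file with exactly 4 lines, writes to a temporary file first. The helper name swap_halves and the short sample sequences are made up for illustration:

```shell
# Hypothetical helper: swap the first two and the last two lines of a 4-line file.
swap_halves() {
    file=$1
    tmp=$(mktemp) || return 1
    # Write the last two lines, then the first two, to the temporary file,
    # and only replace the original if both commands succeeded.
    { tail -n 2 "$file" && head -n 2 "$file"; } > "$tmp" && mv "$tmp" "$file"
}

# Made-up sample input (illustrative sequences only).
sample=$(mktemp)
printf '>ID.1\nAAAA\n>ID.2\nCCCC\n' > "$sample"
swap_halves "$sample"
result=$(cat "$sample")
printf '%s\n' "$result"
```

Because mv replaces the file, the result takes on the permissions of the temporary file, which may differ from the original's.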

Generic solution with GNU tac to reverse contents:

$ tac -bs'>' ip.txt
>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA
>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA

By default, tac reverses the input line by line, but you can customize the separator.

Here, I'm assuming > can be safely used as a unique separator (provided to the -s option). The -b option is used to put the separator before the content in the output.
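To illustrate why this is the generic variant, here is a small sketch that runs the same tac options on three records instead of two. It assumes GNU tac is available and that > appears only at the start of each header line; the sample data is made up:

```shell
# Three made-up records, each starting with '>' (illustrative data only).
input=$(mktemp)
printf '>ID.1\nAAAA\n>ID.2\nCCCC\n>ID.3\nGGGG\n' > "$input"

# -s '>' splits the input on '>'; -b attaches the separator to the
# start of each record rather than the end.
reversed=$(tac -b -s '>' "$input")
printf '%s\n' "$reversed"
```

Note that this reverses the order of all records, which coincides with the requested swap only when there are exactly two.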


Using ed (in-place editing):

# move lines 3-4 to the top
printf '3,4m0\nwq\n' | ed -s ip.txt

# move the last two lines to the top
printf -- '-1,$m0\nwq\n' | ed -s ip.txt

Using sed:

sed '1h;2H;1,2d;4G' file
  • Store the first line in the hold space;
  • Add the second line to the hold space;
  • Don't print the first two lines;
  • Before printing the fourth line, append the hold space to it (i.e., append the 1st and 2nd lines).
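The four steps can also be written as a multi-line sed script with one comment per command, which makes it easy to try on a throwaway file. This is only a sketch; the sample file and its short sequences are made up:

```shell
# Made-up 4-line sample (illustrative sequences only).
sample=$(mktemp)
printf '>ID.1\nAAAA\n>ID.2\nCCCC\n' > "$sample"

swapped=$(sed '
# line 1: copy it into the hold space
1h
# line 2: append it to the hold space
2H
# lines 1-2: delete them so they are not printed yet
1,2d
# line 4: append the hold space (lines 1 and 2) before it is printed
4G
' "$sample")
printf '%s\n' "$swapped"
```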

The GNU AWK manual has an example of swapping two lines using getline. Since you know that

The file always has 4 lines (2 FASTA entries), no more.

you only need to handle the case where the number of lines is evenly divisible by 4, and can use getline as follows. Let file.txt contain

>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA
>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA

then

awk '{line1=$0;getline line2;getline line3;getline line4;printf "%s\n%s\n%s\n%s\n",line3,line4,line1,line2}' file.txt

gives output

>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA
>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA

Explanation: store the current line ( $0 ) in line1 , then read the next three lines into line2 , line3 and line4 with getline , then use printf with 4 placeholders ( %s ), each followed by a newline ( \n ), filled in the order you require.

(tested in GNU Awk 5.0.1)
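For comparison, a sketch of an alternative that does not rely on getline: buffer every line in an array and print the second half before the first in the END block. The names lines and half are illustrative, and the approach assumes an even number of lines and a file small enough to hold in memory:

```shell
# Made-up 4-line sample (illustrative sequences only).
sample=$(mktemp)
printf '>ID.1\nAAAA\n>ID.2\nCCCC\n' > "$sample"

swapped=$(awk '
    { lines[NR] = $0 }    # remember each line under its line number
    END {
        half = NR / 2
        # print the second half first, then the first half
        for (i = half + 1; i <= NR; i++) print lines[i]
        for (i = 1; i <= half; i++) print lines[i]
    }
' "$sample")
printf '%s\n' "$swapped"
```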

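GNU sed:

sed -zE 's/^([^\n]*\n[^\n]*\n)(.*)/\2\1/' file

With -z the whole file is read as a single record, and . also matches newlines there, so each line is matched with [^\n]* rather than .* to stop the first group at the end of line 2. This assumes the file ends with a newline.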

A Perl:

perl -0777 -pe 's/(.*\R.*\R)(.*\R.*\R?)/\2\1/' file

A Ruby:

ruby -ne 'BEGIN{lines=[]}
lines<<$_
END{puts lines[2...4]+lines[0...2] }' file 

Paste and awk:

paste -s file | awk -F'\t' '{print $3, $4, $1, $2}' OFS='\n'

A POSIX pipe:

paste -sd'\t\n' file | nl | sort -nr | cut -f 2- | tr '\t' '\n'
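The pipeline above is easier to follow when split into stages. The sketch below keeps the same commands but stores each intermediate result in a variable (the names paired, reordered and swapped are made up), using a made-up sample file:

```shell
# Made-up 4-line sample (illustrative sequences only).
sample=$(mktemp)
printf '>ID.1\nAAAA\n>ID.2\nCCCC\n' > "$sample"

# Stage 1: join each header/sequence pair into one tab-separated line.
paired=$(paste -sd'\t\n' "$sample")

# Stage 2: number the records, sort by that number in descending order,
# then drop the number again.
reordered=$(printf '%s\n' "$paired" | nl | sort -nr | cut -f 2-)

# Stage 3: split each record back into its two lines.
swapped=$(printf '%s\n' "$reordered" | tr '\t' '\n')
printf '%s\n' "$swapped"
```

Like the tac approach, this reverses the order of all records, so it also works for more than two entries as long as each entry is exactly two lines and contains no tabs.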

This seems to work:

awk -F'\n' '{print $3, $4, $1, $2}' OFS='\n' RS= ORS='\n\n' file.txt
