
Bash: Converting 4 columns of text to interleaved lines (tab-delimited columns to FASTQ file)

I need to convert a 4-column file to 4 lines per entry. The file is tab-delimited.

The file is currently arranged in the following format, with each line representing one record/sequence (there are millions of such lines):

@SRR1012345.1   NCAATATCGTGG    #4=DDFFFHDHH    HWI-ST823:136:C24YTACXX
@SRR1012346.1   GATTACAGATCT    #4=DDFFFHDHH    HWI-ST823:136:C22YTAGXX

I need to rearrange this such that the four columns are presented as 4 lines:

@SRR1012345.1
NCAATATCGTGG
#4=DDFFFHDHH
HWI-ST823:136:C24YTACXX
@SRR1012346.1
GATTACAGATCT
#4=DDFFFHDHH
HWI-ST823:136:C22YTAGXX

What would be the best way to go about doing this, preferably with a bash one-liner? Thank you for your assistance!

You can use tr:

< file tr '\t' '\n' > newfile
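The tr approach can be sanity-checked on a small sample before running it on millions of lines. The following is a minimal sketch; the file names `sample.tsv` and `sample.fastq` and the temporary directory are hypothetical, created only for illustration.

```shell
# Build a tiny tab-delimited sample (hypothetical file names, for illustration only).
tmpdir=$(mktemp -d)
printf '@SRR1012345.1\tNCAATATCGTGG\t#4=DDFFFHDHH\tHWI-ST823:136:C24YTACXX\n' > "$tmpdir/sample.tsv"
printf '@SRR1012346.1\tGATTACAGATCT\t#4=DDFFFHDHH\tHWI-ST823:136:C22YTAGXX\n' >> "$tmpdir/sample.tsv"

# Replace every tab with a newline, as in the tr command above.
tr '\t' '\n' < "$tmpdir/sample.tsv" > "$tmpdir/sample.fastq"

# Each input record should become exactly 4 output lines.
in_lines=$(wc -l < "$tmpdir/sample.tsv")
out_lines=$(wc -l < "$tmpdir/sample.fastq")
echo "input records: $((in_lines)), output lines: $((out_lines))"
```

If the output line count is not exactly four times the input line count, some line probably contains a stray or missing tab.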

It is straightforward to use awk here:

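awk '{print $1; print $2; print $3; print $4}' file

Alternatively, set the output field separator to a newline and assign $1 to itself, which forces awk to rebuild the record using that separator:

$ awk -v OFS='\n' '{$1=$1}1' file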
@SRR1012345.1
NCAATATCGTGG
#4=DDFFFHDHH
HWI-ST823:136:C24YTACXX
@SRR1012346.1
GATTACAGATCT
#4=DDFFFHDHH
HWI-ST823:136:C22YTAGXX
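Note that awk splits on any whitespace by default, so a field containing a space would be broken into extra fields. A hedged variant, assuming the input is strictly tab-delimited, sets the separator explicitly and reports lines that do not have exactly four fields. The sample input and file names below are hypothetical and exist only to demonstrate the behaviour.

```shell
# Hypothetical sample input; the first header contains a space to show why -F'\t' matters.
tmpdir=$(mktemp -d)
printf '@SRR1012345.1 extra\tNCAATATCGTGG\t#4=DDFFFHDHH\tHWI-ST823:136:C24YTACXX\n' > "$tmpdir/in.tsv"
printf 'broken\tline\n' >> "$tmpdir/in.tsv"

# Split on tabs only; emit 4 lines for well-formed records, warn about the rest.
awk -F'\t' -v OFS='\n' '
    NF == 4 { $1 = $1; print; next }   # rebuild the record with OFS and print it
    { printf "line %d: expected 4 fields, got %d\n", NR, NF > "/dev/stderr" }
' "$tmpdir/in.tsv" > "$tmpdir/out.fastq" 2> "$tmpdir/warnings.txt"

cat "$tmpdir/out.fastq"
```

Here the malformed second line is skipped and a warning is written to `warnings.txt`, while the header with the embedded space stays on a single output line.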

