Using Bash to Manually Edit a Text or Fastq file

Question

I would like to use Bash to edit multiple similar lines in a Fastq file.

In Fastq files a sequence read starts on line 2 and then is found every fourth line (i.e. lines 2, 6, 10, 14...).

I would like to create an edited text file that is identical to a Fastq file except the first 6 characters of the sequencing reads are trimmed off.

Unedited Fastq:

@M03017:21:000000000
GAGAGATCTCTCTCTCTCTCT
+
111>>B1FDFFF

Edited Fastq:

@M03017:21:000000000
TCTCTCTCTCTCTCT
+
111>>B1FDFFF

Answer 1

I guess awk is perfect for this:

$ awk 'NR%4==2 {gsub(/^.{6}/,"")} 1' file
@M03017:21:000000000
TCTCTCTCTCTCTCT
+
111>>B1FDFFF

This removes the first 6 characters from every line whose number has the form 4k+2 (lines 2, 6, 10, ...).

Explanation

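NR%4==2 {} runs the action when the record number (line number) has the form 4k+2.
gsub(/^.{6}/,"") replaces the first 6 characters with an empty string. Because the pattern is anchored with ^, it can match only once per line, so sub() would work equally well.
1 evaluates to true, which triggers awk's default action of printing the line.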
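The same idea can be sketched with the trim length as a parameter and the result written to a new file, which is what the question asks for. This is only a sketch: `trim_reads` is a hypothetical helper name, and the sample file is rebuilt from the question's example. It uses substr() instead of the regex, since the {6} interval syntax may not be supported by some older awk implementations.

```shell
# Hypothetical helper: drop the first n characters of every sequence line
# (lines 2, 6, 10, ...) and print the whole file to stdout.
trim_reads() {
    # $1 = number of leading characters to drop, $2 = input Fastq file
    awk -v n="$1" 'NR % 4 == 2 { $0 = substr($0, n + 1) } 1' "$2"
}

# Build a small sample file matching the example in the question.
sample=$(mktemp)
printf '%s\n' '@M03017:21:000000000' 'GAGAGATCTCTCTCTCTCTCT' '+' '111>>B1FDFFF' > "$sample"

# Write the edited copy next to the original, leaving the original untouched.
trim_reads 6 "$sample" > "$sample.trimmed"
cat "$sample.trimmed"
```

Note that Fastq tools generally expect the quality line to be as long as the sequence line. If that matters downstream, changing the condition to `NR % 4 == 2 || NR % 4 == 0` would trim both, although that goes beyond what the question asked for.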

Answer 2

GNU sed can do that:

sed -i~ '2~4s/^.\{6\}//' file

The address 2~4 means "start on line 2, then repeat every 4 lines".

s means substitute, ^ matches the beginning of the line, . matches any character, and \{6\} specifies how many times to repeat it (a "quantifier"). The replacement string is empty ( // ).

-i~ replaces the file in place, leaving a backup with the ~ appended to the filename.
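To see the in-place edit and the backup together, here is a small sketch run on a throwaway file. It assumes GNU sed, since the first~step address form is a GNU extension. The second record in the sample is made up for illustration and is not from the question.

```shell
# Work on a temporary file so the demonstration does not touch real data.
fq=$(mktemp)
printf '%s\n' '@M03017:21:000000000' 'GAGAGATCTCTCTCTCTCTCT' '+' '111>>B1FDFFF' \
              '@M03017:21:000000001' 'ACGTACGTTTTT' '+' '111>>B1FDFFF' > "$fq"

# GNU sed only: edit in place, keeping the original as "$fq~".
sed -i~ '2~4s/^.\{6\}//' "$fq"

# Compare backup and edited file; only lines 2 and 6 should differ.
# diff exits non-zero when files differ, hence the "|| true".
diff "$fq~" "$fq" || true
```

Checking the diff against the backup before deleting it is a cheap way to confirm that only the sequence lines were changed.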

2 answers

Answer 1: posted 2015-02-16 15:57:31
Answer 2 (accepted): posted 2015-02-16 16:22:54
