sed find and replace fastq regex

Question

I have a file such as

head testSed.fastq
@M01551:51:000000000-BCB7H:1:1101:15800:1330 1:N:0:NGTCACTN+TATCCTCTCTTGAAGA
NGTCACTN
+
#>AAAAF#
@M01551:51:000000000-BCB7H:1:1101:15605:1331 1:N:0:NATCAGCN+TAGATCGCCAAGTTAA
NATCAGCN
+
#>>AA?C#
@M01551:51:000000000-BCB7H:1:1101:15557:1332 1:N:0:NCAGCAGN+TATCTTCTATAAATAT
NCAGCAGN

I am trying to replace the string after the final colon with 0 using a regular expression (on lines 1, 5 and 9 in this example, but throughout the whole file).

I have checked my regex with egrep: egrep '[ATGCN]{8}\+[ATGCN]{16}$' testSed.fastq returns all the lines I would expect.

However, when I run sed -i 's/[ATGCN]{8}\+[ATGCN]{16}$/0/g' testSed.fastq, the original file is unchanged and no replacement occurs.

How can I fix this? Is my regex not specific enough?

Answer 1

Do you need a regex for this?

awk -F: -v OFS=: '/^@/ {$NF = "0"} 1' testfile

That won't edit the file in place. If you have GNU awk, you can use its inplace extension:

gawk -F: -v OFS=: -i inplace '...' file

ref: https://www.gnu.org/software/gawk/manual/html_node/Extension-Sample-Inplace.html
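Without GNU awk, the usual fallback is to write to a temporary file and move it over the original. The following is a minimal sketch of that approach, not part of the original answer; the demo file and the variable names `f`, `tmp` and `result` are assumptions for illustration.

```shell
# Hypothetical demo file holding the first record from the question.
f=$(mktemp)
printf '%s\n' \
  '@M01551:51:000000000-BCB7H:1:1101:15800:1330 1:N:0:NGTCACTN+TATCCTCTCTTGAAGA' \
  'NGTCACTN' '+' '#>AAAAF#' > "$f"

# Write the edited copy to a temporary file, then replace the original
# only if awk exited successfully.
tmp=$(mktemp)
awk -F: -v OFS=: '/^@/ {$NF = "0"} 1' "$f" > "$tmp" && mv "$tmp" "$f"

result=$(cat "$f")
printf '%s\n' "$result"
```

Note that mv replaces the original with a new file, so the permissions and ownership of the temporary file apply afterwards.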

Answer 2

Your regex is written as an ERE (extended regular expression), but sed interprets patterns as BRE (basic regular expressions) by default, where unescaped curly braces are literal characters. Not all sed implementations support ERE, so check man sed in your environment for a -r or -E option. Alternatively, you can stay with BRE and write the bounds by preceding the curly braces with backslashes.
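As a sketch of both options, here is the asker's pattern in each syntax, run against one header line from the question. It assumes a sed that accepts -E (current GNU and BSD versions do); the variable names are only for illustration. Note that in the BRE form the plus sign is left unescaped, because a bare + is literal in BRE and GNU sed treats \+ as a repetition operator there.

```shell
# Sample header line taken from the question.
line='@M01551:51:000000000-BCB7H:1:1101:15800:1330 1:N:0:NGTCACTN+TATCCTCTCTTGAAGA'

# ERE: -E enables extended syntax, so {8} is a bound and \+ is a literal plus.
ere_out=$(printf '%s\n' "$line" | sed -E 's/[ATGCN]{8}\+[ATGCN]{16}$/0/')

# BRE (sed's default): bounds are written \{8\}, and a bare + is a literal plus.
bre_out=$(printf '%s\n' "$line" | sed 's/[ATGCN]\{8\}+[ATGCN]\{16\}$/0/')

printf '%s\n%s\n' "$ere_out" "$bre_out"
```

Both commands should print the header with its last field reduced to 0.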

That said, rather than matching the precise text in the last field, why not just look for the string that starts with a colon, and is followed by no-more-colons? The following RE is both BRE and ERE compatible.

$ sed '/^@/s/:[^:]*$/:0/' testq
@M01551:51:000000000-BCB7H:1:1101:15800:1330 1:N:0:0
NGTCACTN
+
#>AAAAF#
@M01551:51:000000000-BCB7H:1:1101:15605:1331 1:N:0:0
NATCAGCN
+
#>>AA?C#
@M01551:51:000000000-BCB7H:1:1101:15557:1332 1:N:0:0
NCAGCAGN
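One caveat with selecting headers by /^@/: in Phred+33 encoding, both @ and : are valid quality characters, so a quality line can begin with @ and contain a colon. A possible safeguard, assuming every record is exactly four lines with no wrapped sequences, is to pick header lines by position instead. This is a sketch under that assumption; the record below is made up to trigger the edge case.

```shell
# Hypothetical record whose quality string starts with '@' and contains ':'.
rec=$(printf '%s\n' \
  '@M01551:51:000000000-BCB7H:1:1101:15800:1330 1:N:0:NGTCACTN+TATCCTCTCTTGAAGA' \
  'NGTCACTN' '+' '@>AA:AF#')

# Treat only the first line of every 4-line record as a header.
fixed=$(printf '%s\n' "$rec" | awk 'NR % 4 == 1 { sub(/:[^:]*$/, ":0") } 1')
printf '%s\n' "$fixed"
```

Here the quality line is left untouched, whereas the /^@/ address would have truncated it after its last colon.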

2 answers

solution1
2 2017-10-24 16:48:54

solution2
1 ACCEPTED 2017-10-24 16:54:34
