
How to remove n characters from a specific column using sed/awk/perl

I have the following tab delimited data:

chr1    3119713 3119728 MA05911Bach1Mafk    839 +
chr1    3119716 3119731 MA05011MAFNFE2  860 +
chr1    3120036 3120051 MA01502Nfe2l2   866 +

I want to remove the first 7 characters from the 4th column, resulting in:

chr1    3119713 3119728 Bach1Mafk   839 +
chr1    3119716 3119731 MAFNFE2 860 +
chr1    3120036 3120051 Nfe2l2  866 +

How can I do that? Note that the output also needs to be tab-separated.

I'm stuck with the following code, which removes the characters from the start of the line (the first column) instead of from the 4th column:

sed 's/^.\{7\}//' myfile.txt
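
With awk

Set both FS and OFS to a tab; otherwise assigning to $4 rebuilds the line with single spaces instead of tabs.

awk 'BEGIN { FS = OFS = "\t" } { $4 = substr($4, 8); print }' myfile.txt

With perl

perl -anE'$F[3] =~ s/.{7}//; say join "\t", @F' data.txt

or

perl -anE'substr $F[3],0,7,""; say join "\t", @F' data.txt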

With sed

$ sed -E 's/^(([^\t]+\t){3}).{7}/\1/' myfile.txt
chr1    3119713 3119728 Bach1Mafk   839 +
chr1    3119716 3119731 MAFNFE2 860 +
chr1    3120036 3120051 Nfe2l2  866 +
  • -E use extended regular expressions, to avoid having to use \ for (){}. Some sed versions might need -r instead of -E
  • ^(([^\t]+\t){3}) capture the first three columns, easy to change number of columns if needed
  • .{7} characters to delete from 4th column
  • \1 the captured columns
  • Use -i option for in-place editing
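
If the column number and character count need to vary, the same idea can be wrapped in a small shell function. This is only a sketch: strip_prefix is a hypothetical helper name, not something from the answers above, and it assumes tab-separated input and an awk that accepts -v assignments (any POSIX awk should).

```shell
# strip_prefix COL N [FILE...]: hypothetical helper, not from the original answers.
# Removes the first N characters from tab-separated column COL.
strip_prefix() {
    col=$1
    n=$2
    shift 2
    awk -v col="$col" -v n="$n" '
        BEGIN { FS = OFS = "\t" }               # read and write tab-separated fields
        { $col = substr($col, n + 1); print }   # keep everything after the first n characters
    ' "$@"
}

# Reproduce the case from the question: drop 7 characters from column 4.
printf 'chr1\t3119713\t3119728\tMA05911Bach1Mafk\t839\t+\n' | strip_prefix 4 7
```

With no file arguments the function reads standard input, so it can be used in a pipeline or as strip_prefix 4 7 myfile.txt.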


With perl you can use \K, which works like a variable-length positive lookbehind: the text matched before \K is kept, and only what follows it is replaced

perl -pe 's/^([^\t]+\t){3}\K.{7}//' myfile.txt
