Remove all characters after the last occurrence of a pattern

I'm parsing a text file such as:

>scaffold_1:52559-5269(+):mus_musculus:15-207(+)
AAAGAAAATAATAAAGAAA
>scaffold_2:27092-2200(+):mus_musculus:0-105(+)
AAAGAAAATAAT

The idea is to remove everything from the last : onward, to get:

>scaffold_1:52559-5269(+):mus_musculus
AAAGAAAATAATAAAGAAA
>scaffold_2:27092-2200(+):mus_musculus
AAAGAAAATAAT

I know the sed command, but not how to target the last occurrence. Thanks for your help.

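Match a colon followed by any number of non-colon characters up to the end of the line, and replace the match with nothing.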

sed 's/:[^:]*$//'
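
As a quick check of the expression above, here is a minimal sketch that runs it on two lines from the question. The function name strip_last_field is hypothetical and only wraps the sed call for reuse.

```shell
# Hypothetical helper: wraps the sed expression from the answer.
strip_last_field() {
  # :[^:]*$ matches the last colon and everything after it up to end of line.
  # Lines without a colon (the sequence lines) do not match and pass through unchanged.
  sed 's/:[^:]*$//' "$@"
}

printf '%s\n' \
  '>scaffold_1:52559-5269(+):mus_musculus:15-207(+)' \
  'AAAGAAAATAATAAAGAAA' | strip_last_field
```

Because [^:] cannot match a colon and $ anchors the match at the end of the line, the only colon the pattern can start at is the last one.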

Another approach, based on cut (keep the first three : separated fields):

$ cut -d : -f -3 file

Output:

>scaffold_1:52559-5269(+):mus_musculus
AAAGAAAATAATAAAGAAA
>scaffold_2:27092-2200(+):mus_musculus
AAAGAAAATAAT
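
Note that cut -f -3 keeps a fixed number of fields, so it matches "remove after the last :" only when every header has exactly four fields, as in the sample. The sketch below compares it with sed on a made-up header that has five fields; the header value is an assumption for illustration, not data from the question.

```shell
# Hypothetical header with five colon-separated fields instead of four.
header='>scaffold_3:10-20(+):mus_musculus:extra:0-5(+)'

# cut keeps fields 1-3 no matter how many fields follow.
by_cut=$(printf '%s\n' "$header" | cut -d : -f -3)

# sed removes only the last field.
by_sed=$(printf '%s\n' "$header" | sed 's/:[^:]*$//')

printf 'cut: %s\nsed: %s\n' "$by_cut" "$by_sed"
```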

A cut-based solution using rev:

while IFS= read -r line
do
  # Reverse the line, drop the first field (originally the last), reverse back
  printf '%s\n' "$line" | rev | cut -d: -f2- | rev
done < file.txt
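
The loop above starts three processes per line. Since rev and cut already work line by line, the same idea can probably be applied to the whole input at once. This is a sketch under that assumption; the function name strip_last_rev is hypothetical.

```shell
# Hypothetical helper: same reverse / cut / reverse idea, without the shell loop.
strip_last_rev() {
  # After rev, the last field is the first one, so -f2- drops it.
  # cut prints lines that contain no delimiter unchanged (no -s flag),
  # so the sequence lines survive the round trip.
  rev "$@" | cut -d: -f2- | rev
}

printf '%s\n' \
  '>scaffold_2:27092-2200(+):mus_musculus:0-105(+)' \
  'AAAGAAAATAAT' | strip_last_rev
```

On a real file this would be called as strip_last_rev file.txt.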


 