Replace text in file by hash values with matching keys
I would like to replace all words in a file that match the keys of my hash with the corresponding values.
$VAR1 = {
'asmbl_1' => 'TCONS_00000046',
'asmbl_2' => 'TCONS_00000014',
'asmbl_16' => 'MELO3C000012',
}
The file:
CM3.6.1_CONTIG30890 assembler transcript 187 1568 . - . gene_id "PASA_cluster_1"; transcript_id "align_id:184317|asmbl_1";
CM3.6.1_CONTIG30890 assembler exon 187 251 . - . gene_id "PASA_cluster_1"; transcript_id "align_id:184317|asmbl_1";
CM3.6.1_CONTIG30898 assembler exon 1339 2793 . - . gene_id "PASA_cluster_2"; transcript_id "align_id:184318|asmbl_2";
Desired output:
CM3.6.1_CONTIG30890 assembler transcript 187 1568 . - . gene_id "PASA_cluster_1"; transcript_id "align_id:184317|TCONS_00000046";
CM3.6.1_CONTIG30890 assembler exon 187 251 . - . gene_id "PASA_cluster_1"; transcript_id "align_id:184317|TCONS_00000046";
CM3.6.1_CONTIG30898 assembler exon 1339 2793 . - . gene_id "PASA_cluster_2"; transcript_id "align_id:184318|TCONS_00000014";
I'm looking for a straightforward way to do this, preferably in Perl, since the script I'm writing is in Perl.
(What is the difference between these two methods?)
Something like sed -i 's/key/value/' is a bit ugly; I would prefer to do it all in Perl.
There's a nice trick I like that basically involves building a regex from the hash keys, capturing whichever key matched, and using the capture to look up its replacement:
use strict;
use warnings;
my %replace = (
'asmbl_1' => 'TCONS_00000046',
'asmbl_2' => 'TCONS_00000014',
'asmbl_16' => 'MELO3C000012',
);
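# Longest keys first, so asmbl_16 is tried before asmbl_1; quotemeta escapes regex metacharacters.
my $search = join( "|", map {quotemeta} sort { length ($b) <=> length ($a) } keys %replace );
# \b anchors restrict matches to whole words; the capture group feeds $1.
$search = qr/\b($search)\b/;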
while (<>) {
s/$search/$replace{$1}/g;
print;
}
Something like that produces the desired output. (The diamond operator <> reads the content from STDIN, or from a file named on the command line, as in myscript.pl <some_File_To_process>.)
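To see why the keys are sorted longest-first and wrapped in \b, here is a small sketch that compares a naive alternation with the anchored one. The helper names build_regex and substitute, and the shortened sample line, are made up for this illustration and are not part of the answer above.

```perl
use strict;
use warnings;

my %replace = (
    'asmbl_1'  => 'TCONS_00000046',
    'asmbl_16' => 'MELO3C000012',
);

# Hypothetical helper: build an alternation from the given keys, in the
# given order; $anchored decides whether \b word boundaries are added.
sub build_regex {
    my ( $anchored, @keys ) = @_;
    my $alt = join '|', map { quotemeta } @keys;
    return $anchored ? qr/\b($alt)\b/ : qr/($alt)/;
}

# Hypothetical helper: apply the substitution to a copy of the text.
sub substitute {
    my ( $re, $text ) = @_;
    $text =~ s/$re/$replace{$1}/g;
    return $text;
}

my $line = 'transcript_id "align_id:1|asmbl_16";';

# Shortest key first and no boundaries: asmbl_1 matches inside asmbl_16,
# leaving a stray "6" behind the replacement.
my $naive = build_regex( 0, sort { length($a) <=> length($b) } keys %replace );
print substitute( $naive, $line ), "\n";

# Longest key first, with boundaries: the whole identifier is replaced.
my $safe = build_regex( 1, sort { length($b) <=> length($a) } keys %replace );
print substitute( $safe, $line ), "\n";
```

The first print gives transcript_id "align_id:1|TCONS_000000466"; (the wrong ID with a leftover 6), while the second gives transcript_id "align_id:1|MELO3C000012";. With \b in place the regex engine would backtrack to the longer key anyway, so the sort mainly matters when the boundaries are dropped.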
This is all that is necessary:
use strict;
use warnings;
my %map = (
asmbl_1 => 'TCONS_00000046',
asmbl_2 => 'TCONS_00000014',
asmbl_16 => 'MELO3C000012',
);
my $re = join '|', map quotemeta, keys %map;
while ( <DATA> ) {
s/\b($re)\b/$map{$1}/g;
print;
}
__DATA__
CM3.6.1_CONTIG30890 assembler transcript 187 1568 . - . gene_id "PASA_cluster_1"; transcript_id "align_id:184317|asmbl_1";
CM3.6.1_CONTIG30890 assembler exon 187 251 . - . gene_id "PASA_cluster_1"; transcript_id "align_id:184317|asmbl_1";
CM3.6.1_CONTIG30898 assembler exon 1339 2793 . - . gene_id "PASA_cluster_2"; transcript_id "align_id:184318|asmbl_2";
Output:
CM3.6.1_CONTIG30890 assembler transcript 187 1568 . - . gene_id "PASA_cluster_1"; transcript_id "align_id:184317|TCONS_00000046";
CM3.6.1_CONTIG30890 assembler exon 187 251 . - . gene_id "PASA_cluster_1"; transcript_id "align_id:184317|TCONS_00000046";
CM3.6.1_CONTIG30898 assembler exon 1339 2793 . - . gene_id "PASA_cluster_2"; transcript_id "align_id:184318|TCONS_00000014";
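Since the question is about changing a file on disk (the sed -i idea), the same substitution could be wrapped in a small routine that rewrites a file. This is only a sketch: replace_in_file is a hypothetical helper name, and it assumes the file is small enough to hold in memory and that overwriting it without a backup is acceptable.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answers above): rewrite the file at
# $path, replacing every whole-word occurrence of a key of %$map with its value.
sub replace_in_file {
    my ( $path, $map ) = @_;

    # Longest keys first, so asmbl_16 is tried before asmbl_1.
    my $alt = join '|', map { quotemeta } sort { length($b) <=> length($a) } keys %$map;
    my $re  = qr/\b($alt)\b/;

    # Read the whole file into memory (assumption: it is small enough).
    open my $in, '<', $path or die "Cannot read $path: $!";
    my @lines = <$in>;
    close $in;

    # Each element of @lines is aliased to $_, so it is changed in place.
    s/$re/$map->{$1}/g for @lines;

    # Overwrite the original file with the substituted lines.
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} @lines;
    close $out or die "Cannot close $path: $!";

    return scalar @lines;    # number of lines processed
}
```

It might be called as replace_in_file( 'annotation.gtf', \%map ), where annotation.gtf is just a placeholder file name. Keeping a copy of the original file before running it would be prudent, because the routine overwrites its input.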