Finding text in a file using a Perl program
I have the file shown below:
>AAF88103.1 zinc finger protein 226 [Homo sapiens]
MNMFKEAVTFKDVAVAFTEEELGLLGPAXRKLYRDVMVENFRNLLSVGHPPFKQDVSPIERNEQLWIMTT
ATRRQGNLGEKNQSKLITVQDRESEEELSCWQIWQQIANDLTRCQDSMINNSQCHKQGDFPYQVGTELSI
QISEDENYIVNKADGPNNTGNPEFPILRTQDSWRKTFLTESQRLNRDQQISIKNKLCQCKKGVDPIGWIS
HHDGHRVHKSEKSYRPNDYEKDNMKILTFDHNSMIHTGQKSYQCNECKKPFSDLSSFDLHQQLQSGEKSL
TCVERGKGFCYSPVLPVHQKVHVGEKLKCDECGKEFSQGAHLQTHQKVHVIEKPYKCKQCGKGFSRRSAL
NVHCKVHTAEKPYNCEECGRAFSQASHLQDHQRLHTGEKPFKCDACGKSFSRNSHLQSHQRVHTGEKPYK
CEECGKGFICSSNLYIHQRVHTGEKPYKCEECGKGFSRPSSLQAHQGVHTGEKSYICTVCGKGFTLSSNL
QAHQRVHTGEKPYKCNECGKSFRRNSHYQVHLVVHTGEKPYKCEICGKGFSQSSYLQIHQKAHSIEKPFK
CEECGQGFNQSSRLQIHQLIHTGEKPYKCEECGKGFSRRADLKIHCRIHTGEKPYNCEECGKVFRQASNL
LAHQRVHSGEKPFKCEECGKSFGRSAHLQAHQKVHTGDKPYKCDECGKGFKWSLNLDMHQRVHTGEKPYK
CGECGKYFSQASSLQLHQSVHTGEKPYKCDVCGKVFSRSSQLQSHQRVHTGEKPYKCEICGKSFSWRSNL
TVHHRIHVGDKSYKSNRGGKNIRESTQEKKSIK.
In this file I am trying to look for a sequence, i.e. CDECGKEFSQGAHLQTHQKVH. I have to hard-code this pattern in the program and then look for it. My code is as follows:
open FILE1, "file.fasta" or die;
while (my $line= <FILE1>) {
chomp $line;
}
if ($line =~ /CDECGKEFSQGAHLQTHQKVH/) {
print "The protein contains the domain";
}else{
print "The protein doesn't contain the domain";
}
This pattern does occur in the sequence, but I always get the message "The protein doesn't contain the domain". Am I doing it wrong?
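The test on $line happens outside of the loop, after every line has already been read, so it is never applied to any line of the file. The $line declared in the while condition is out of scope by then, and the match runs against a different, undefined variable. Move the check inside the loop. Give this a try: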
my $filename = 'file.fasta';
my $found = 0;
open(my $fh, '<', $filename) or die "Could not open file '$filename': $!";
while (my $line = <$fh>) {
    # Test each line inside the loop, while $line is still in scope.
    if ($line =~ /CDECGKEFSQGAHLQTHQKVH/) {
        $found = 1;
        last;  # stop reading at the first match
    }
}
close($fh);
if ($found == 1) {
    print "The protein contains the domain";
} else {
    print "The protein doesn't contain the domain";
}
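One caveat with the line-by-line loop above: FASTA files wrap the sequence across lines, so a motif that happens to straddle a line break would be missed. In the file from the question the motif sits entirely within one line, so the loop works there. A minimal sketch of a more general check, assuming the file holds a single record, is to join the sequence lines first and search the joined string. The helper name fasta_contains_motif is hypothetical, not something from the original post:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns 1 if $motif
# occurs anywhere in the joined sequence of the FASTA file at $path,
# otherwise 0. Assumes the file contains a single record; with several
# records, a match could wrongly span two neighbouring sequences.
sub fasta_contains_motif {
    my ($path, $motif) = @_;

    open(my $fh, '<', $path) or die "Could not open file '$path': $!";

    my $sequence = '';
    while (my $line = <$fh>) {
        next if $line =~ /^>/;   # skip the header line
        $line =~ s/\s+//g;       # drop newlines and any other whitespace
        $sequence .= $line;      # append this chunk of the sequence
    }
    close($fh);

    # index() does a literal substring search, so no regex escaping is needed.
    return index($sequence, $motif) >= 0 ? 1 : 0;
}
```

With this in place, fasta_contains_motif('file.fasta', 'CDECGKEFSQGAHLQTHQKVH') should return a true value for the file shown in the question, and the same if/else as before can print the result.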