
Finding text in a file using a Perl program

I have a file like this:

>AAF88103.1 zinc finger protein 226 [Homo sapiens]
MNMFKEAVTFKDVAVAFTEEELGLLGPAXRKLYRDVMVENFRNLLSVGHPPFKQDVSPIERNEQLWIMTT
ATRRQGNLGEKNQSKLITVQDRESEEELSCWQIWQQIANDLTRCQDSMINNSQCHKQGDFPYQVGTELSI
QISEDENYIVNKADGPNNTGNPEFPILRTQDSWRKTFLTESQRLNRDQQISIKNKLCQCKKGVDPIGWIS
HHDGHRVHKSEKSYRPNDYEKDNMKILTFDHNSMIHTGQKSYQCNECKKPFSDLSSFDLHQQLQSGEKSL
TCVERGKGFCYSPVLPVHQKVHVGEKLKCDECGKEFSQGAHLQTHQKVHVIEKPYKCKQCGKGFSRRSAL
NVHCKVHTAEKPYNCEECGRAFSQASHLQDHQRLHTGEKPFKCDACGKSFSRNSHLQSHQRVHTGEKPYK
CEECGKGFICSSNLYIHQRVHTGEKPYKCEECGKGFSRPSSLQAHQGVHTGEKSYICTVCGKGFTLSSNL
QAHQRVHTGEKPYKCNECGKSFRRNSHYQVHLVVHTGEKPYKCEICGKGFSQSSYLQIHQKAHSIEKPFK
CEECGQGFNQSSRLQIHQLIHTGEKPYKCEECGKGFSRRADLKIHCRIHTGEKPYNCEECGKVFRQASNL
LAHQRVHSGEKPFKCEECGKSFGRSAHLQAHQKVHTGDKPYKCDECGKGFKWSLNLDMHQRVHTGEKPYK
CGECGKYFSQASSLQLHQSVHTGEKPYKCDVCGKVFSRSSQLQSHQRVHTGEKPYKCEICGKSFSWRSNL
TVHHRIHVGDKSYKSNRGGKNIRESTQEKKSIK.

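In this file I am trying to find one sequence, namely CDECGKEFSQGAHLQTHQKVH. I have to hard-code this pattern in the program and then search for it. My code is as follows: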

    open FILE1, "file.fasta" or die;

while (my $line= <FILE1>) {

chomp $line;

}

if ($line =~ /CDECGKEFSQGAHLQTHQKVH/) {
    print "The protein contains the domain";
}else{
    print "The protein doesn't contain the domain";
}

The pattern does occur in the sequence, but I always get the message "The protein doesn't contain the domain". What am I doing wrong?

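The test on $line happens outside the loop. Because $line is declared with my in the while condition, it only exists inside the loop body, so by the time the if runs the file has already been read and the $line being tested is a different, undefined variable. Do the match inside the loop instead. Try this: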

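use strict;
use warnings;

my $filename = 'file.fasta';
my $found = 0;
open(my $fh, '<', $filename) or die "Could not open file '$filename': $!";
while (my $line = <$fh>) {
    # Test each line inside the loop, while $line is still in scope.
    if ($line =~ /CDECGKEFSQGAHLQTHQKVH/) {
        $found = 1;
        last;    # stop reading once the pattern has been found
    }
}
close($fh);

if ($found) {
    print "The protein contains the domain\n";
} else {
    print "The protein doesn't contain the domain\n";
}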
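
One caveat with matching line by line: FASTA sequences are wrapped at a fixed width, so a pattern can straddle a line break and never appear whole on any single line. In the file above the pattern happens to sit on one line, so the loop works, but a safer approach is to join the sequence lines first. Here is a minimal sketch of that idea, assuming a single record; the helper name contains_motif and the short demo record are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $motif occurs anywhere in the
# sequence part of a single-record FASTA text, 0 otherwise.
sub contains_motif {
    my ($fasta_text, $motif) = @_;
    my $sequence = '';
    for my $line (split /\r?\n/, $fasta_text) {
        next if $line =~ /^>/;    # skip the header line
        $line =~ s/\s+//g;        # drop any stray whitespace
        $sequence .= $line;       # join the wrapped sequence lines
    }
    # index() does a plain substring search, no regex needed.
    return index($sequence, $motif) >= 0 ? 1 : 0;
}

# Made-up demo record: the pattern is split across two lines,
# so a per-line match would miss it.
my $example = ">demo\nAAACDECGKEF\nSQGAHLQTHQKVHGGG\n";

if (contains_motif($example, 'CDECGKEFSQGAHLQTHQKVH')) {
    print "The protein contains the domain\n";
} else {
    print "The protein doesn't contain the domain\n";
}
```

To use it on a real file, read the whole file into a string and pass that in. Skipping the header line also avoids a false match on text in the description.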
