Reading data from a file using Perl

I used the following Perl code to open and read a file that contains protein data:

#!/usr/bin/perl -w

$proteinFileName = 'NM_021964fragment.pep';
open (PROTEINFILE, $proteinFileName);
$protein = <PROTEINFILE>;
close PROTEINFILE;

print "Here is the protein: \n";
print "$protein";
exit;

The problem is that the code does not print the data. The protein sequence file 'NM_021964fragment.pep' is on my desktop, and even when I specify its location the file is not read.

Any idea how to get the code running?

First, print a diagnostic message when open fails. This is the usual Perl way to do it:

open PROTEINFILE, $proteinFileName or die $!;
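A slightly more defensive variant of the same idea, as a sketch: three-argument open with a lexical filehandle, and a message that names the file as well as the OS error in $!. The helper name first_line is made up for this illustration, and eval is used only so the failure reason can be printed without aborting the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the question): returns the first line of
# a file, or dies with a message naming the file and the OS error ($!).
sub first_line {
    my ($path) = @_;
    open(my $fh, '<', $path)
        or die "Cannot open '$path': $!\n";
    my $line = <$fh>;
    close($fh);
    return $line;
}

# Trap the failure with eval so the reason is shown instead of a silent
# empty result.
my $line = eval { first_line('NM_021964fragment.pep') };
if ($@) {
    print "open failed: $@";
}
else {
    print "Here is the protein:\n", (defined $line ? $line : '');
}
```

If the file is not in the current working directory, this prints the reason reported by the operating system rather than printing nothing.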

Some questions:

  • What is the current working directory from which you start the Perl program?
  • For your example above to work, you should cd to the directory where the pep file is located.
  • Do you start the script from the command line?
  • Does this happen to be Cygwin Perl on a Windows machine?
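To answer the working-directory question from inside the script, something like the following sketch may help. It uses only the core modules Cwd and File::Spec. The Desktop location and the use of $ENV{HOME} are assumptions for illustration and will differ between systems.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Spec;

my $proteinFileName = 'NM_021964fragment.pep';

# A relative file name is resolved against the current working directory,
# not against the directory that holds the script.
print "Working directory: ", getcwd(), "\n";
print "Looking for:       ", File::Spec->rel2abs($proteinFileName), "\n";

if (-e $proteinFileName) {
    print "File found.\n";
}
else {
    print "File not found here; cd to its directory or use a full path.\n";
}

# Assumption: the file sits in a Desktop folder under the home directory.
# $ENV{HOME} is normally set on Unix-like systems; adjust on Windows.
my $home = defined $ENV{HOME} ? $ENV{HOME} : '.';
my $desktopPath = File::Spec->catfile($home, 'Desktop', $proteinFileName);
print "Full-path candidate: $desktopPath\n";
```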

Next, the <> operator reads one line from a file when used in scalar context. If the file has multiple lines and you want to read the whole file into a string, you need a loop that concatenates all lines:

$protein .= $_ while (<PROTEINFILE>);
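The difference between reading one line, looping over all lines, and reading in list context can be seen in this self-contained sketch. The temporary file and its two lines of sequence data are made up so the example runs anywhere.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small two-line sample file (made-up sequence data).
my ($tmp, $path) = tempfile(UNLINK => 1);
print $tmp "MNIDDKLEGL\nFLKCGGIDEM\n";
close($tmp);

# Scalar context: <$fh> returns only the next line.
open(my $fh, '<', $path) or die "Cannot open '$path': $!";
my $firstLine = <$fh>;
close($fh);

# Loop: append every line to one string.
open($fh, '<', $path) or die "Cannot open '$path': $!";
my $protein = '';
while (my $line = <$fh>) {
    $protein .= $line;
}
close($fh);

# List context: one array element per line.
open($fh, '<', $path) or die "Cannot open '$path': $!";
my @lines = <$fh>;
close($fh);

print "First line only: $firstLine";
print "Whole file:\n$protein";
print "Number of lines: ", scalar(@lines), "\n";
```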

I think the problem may be that your Perl script is not in the same directory as your protein file. You can put the two in the same directory and try again. If you want to pass the file path as a parameter, you can write something like this:

@ARGV == 1 || die "please input the file path";
open(IN, $ARGV[0]) || die "can't open the file $ARGV[0]: $!";
$protein = <IN>;  ## reads one line at a time; to get everything, use a while loop or: @protein = <IN>;
## something you like to do
close(IN);

Your code reads only the first line of the file.
Use the File::Slurp module to read the whole file into a variable, or try the code below.

#!/usr/bin/perl -w
use strict;
my $proteinFileName = 'NM_021964fragment.pep';
my $protein;
{ 
   local $/= undef;

   open (my $PROTEINFILE, '<', $proteinFileName) || die "Can't open $proteinFileName: $!";
   $protein = <$PROTEINFILE>;
   close $PROTEINFILE;
}

print "Here is the protein: \n";
print "$protein";
exit;

From man perlvar:

$/  The input record separator, newline by default. [...] You may set it [...] to "undef" to read through the end of file.
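For the File::Slurp route mentioned in this answer, a minimal sketch follows. File::Slurp is a CPAN module and not part of core Perl, so the example assumes it may be missing and falls back to the undef-separator idiom in that case. The temporary file and its contents are made up so the sketch is self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Made-up sample data standing in for the real .pep file.
my ($tmp, $path) = tempfile(UNLINK => 1);
print $tmp "MNIDDKLEGL\nFLKCGGIDEM\n";
close($tmp);

my $protein;
if (eval { require File::Slurp; 1 }) {
    # In scalar context read_file returns the whole file as one string.
    $protein = File::Slurp::read_file($path);
}
else {
    # Core-only fallback: undefine the input record separator locally so
    # a single read returns everything up to end of file.
    $protein = do {
        local $/;
        open(my $fh, '<', $path) or die "Cannot open '$path': $!";
        <$fh>;
    };
}

print "Here is the protein:\n$protein";
```

Either branch leaves the full file contents, newlines included, in $protein.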
