How do I serve a large file for download with Perl?

I need to serve a large file (500+ MB) for download from a location that is not accessible to the web server. I found the question Serving large files with PHP, which is identical to my situation, but I'm using Perl instead of PHP.

I tried simply printing the file line by line, but this does not cause the browser to prompt for download before grabbing the entire file:

use Tie::File;

open my $fh, '<', '/path/to/file.txt'
    or die "Could not open file: $!";
tie my @file, 'Tie::File', $fh
    or die "Could not tie file: $!";
my $size_in_bytes = -s $fh;
print "Content-type: text/plain\n";
print "Content-Length: $size_in_bytes\n";
print "Content-Disposition: attachment; filename=file.txt\n\n";
for my $line (@file) {
    print $line;
}
untie @file;
close $fh;
exit;

Does Perl have an equivalent to PHP's readfile() function (as suggested with PHP), or is there a way to accomplish what I'm trying to do here?

If you just want to slurp input to output, this should do the trick.

use Carp ();

{ #Lexical For FileHandle and $/ 
  open my $fh, '<' , '/path/to/file.txt' or Carp::croak("File Open Failed");
  local $/ = undef; 
  print scalar <$fh>; 
  close $fh or Carp::carp("File Close Failed");
}

In response to "Does Perl have an equivalent to PHP's readfile()?", I guess my answer would be "But it doesn't really need one".

I've used PHP's manual file I/O controls and they're a pain. Perl's are so easy to use by comparison that shelling out for a one-size-fits-all function seems like overkill.

Also, you might want to look at X-SendFile support, which basically means sending a header to your webserver to tell it what file to send: http://john.guen.in/past/2007/4/17/send_files_faster_with_xsendfile/ (assuming, of course, that the webserver has sufficient permissions to access the file, and the file is just NOT normally accessible via a standard URI).

Edit: Noted, it is better to do it in a loop. I tested the above code against a hard drive, and it does implicitly try to store the whole thing in an invisible temporary variable and eat all your RAM.

Alternative using blocks

The following improved code reads the given file in blocks of 8192 chars, which is much more memory efficient, and gets a throughput respectably comparable with my disk's raw read rate. (I also pointed it at /dev/full for fits and giggles and got a healthy 500 MB/s throughput, and it didn't eat all my RAM, so that must be good.)

{ 
    open my $fh , '<', '/dev/sda' ; 
    local $/ = \8192; # this tells IO to use 8192 char chunks. 
    print $_ while defined ( $_ = scalar <$fh> ); 
    close $fh; 
}

Applying jrockway's suggestions

{ 
    open my $fh , '<', '/dev/sda5' ; 
    print $_ while ( sysread $fh, $_ , 8192 ); 
    close $fh; 
}

This literally doubles performance, and in some cases gets me better throughput than dd does O_o.
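For an actual download you would wrap that loop in something with error checking and binary mode. Here is a minimal sketch, assuming the headers have already been printed; `stream_file` and its 8192-byte default are made-up names for illustration, not from any module.

```perl
use strict;
use warnings;

# Copy everything from $in to $out in fixed-size blocks.
# Returns the number of bytes copied. stream_file() is a
# hypothetical helper, not part of any CPAN module.
sub stream_file {
    my ($in, $out, $block_size) = @_;
    $block_size ||= 8192;

    binmode $in;     # no newline translation on the way in
    binmode $out;    # ... or on the way out

    my $total = 0;
    my $buffer;
    while (1) {
        # sysread returns undef on error and 0 at end of file
        my $read = sysread $in, $buffer, $block_size;
        die "Read failed: $!" unless defined $read;
        last if $read == 0;

        print {$out} $buffer or die "Write failed: $!";
        $total += $read;
    }
    return $total;
}
```

In a CGI script the call would look something like `stream_file($fh, \*STDOUT)`. Since sysread bypasses Perl's buffering, don't mix it with `<$fh>` or `read` on the same handle.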

The readline function is called readline (and can also be written as <>).

I'm not sure what problem you're having. Perhaps it's that for loops aren't lazily evaluated (which they're not). Or perhaps Tie::File is screwing something up? Anyway, the idiomatic Perl for reading a file a line at a time is:

open my $fh, '<', $filename or die ...;
while(my $line = <$fh>){
   # process $line
}

No need to use Tie::File.

Finally, you should not be handling this sort of thing yourself. This is a job for a web framework. If you were using Catalyst (or HTTP::Engine), you would just say:

open my $fh, '<', $filename ...
$c->res->body( $fh );

and the framework would automatically serve the data in the file efficiently. (Using stdio via readline is not a good idea here; it's better to read the file in blocks from the disk. But who cares, it's abstracted!)

You could use my Sys::Sendfile module. It should be highly efficient (as it uses sendfile underneath the hood), but it is not entirely portable (only Linux, FreeBSD and Solaris are currently supported).

Answering the (original) question ("Does Perl have an equivalent to PHP's readline() function ... ?"), the answer is "the angle bracket syntax":

open my $fh, '<', '/path/to/file.txt' or die "Could not open file: $!";
while (my $line = <$fh>) {
    print $line;
}

Getting the content-length with this method isn't necessarily easy, though, so I'd recommend staying with Tie::File.


NOTE

Using:

for my $line (<$filehandle>) { ... }

(as I originally wrote) copies the contents of the file to a list and iterates over that. Using

while (my $line = <$filehandle>) { ... }

does not. When dealing with small files the difference isn't significant, but when dealing with large files it definitely can be.
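The difference comes down to context: `<$fh>` in scalar context returns one line, while in list context it returns every remaining line at once. A small sketch, assuming an in-memory five-line "file" (the data and variable names are made up for illustration):

```perl
use strict;
use warnings;

# Five-line "file" held in memory so the example needs no real file.
my $data = join '', map { "line $_\n" } 1 .. 5;
open my $fh, '<', \$data or die "Could not open in-memory file: $!";

my $first = <$fh>;    # scalar context: reads exactly one line
my @rest  = <$fh>;    # list context: reads all remaining lines into memory
close $fh;

printf "list read pulled in %d lines in one go\n", scalar @rest;
```

A `for` loop puts `<$fh>` in list context, which is why it builds the whole list before the first iteration.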


Answering the (updated) question ("Does Perl have an equivalent to PHP's readfile() function ... ?"), the answer is slurping. There are a couple of syntaxes, but Perl6::Slurp seems to be the current module of choice.

The implied question ("why doesn't the browser prompt for download before grabbing the entire file?") has absolutely nothing to do with how you're reading in the file, and everything to do with what the browser thinks is good form. I would guess that the browser sees the mime-type and decides it knows how to display plain text.


Looking more closely at the Content-Disposition problem, I remember having similar trouble with IE ignoring Content-Disposition. Unfortunately I can't remember the workaround. IE has a long history of problems here (old page, refers to IE 5.0, 5.5 and 6.0). For clarification, however, I would like to know:

  1. What kind of link are you using to point to this big file (i.e., are you using a normal <a href="perl_script.cgi?filename.txt"> link, or are you using JavaScript of some kind)?

  2. What system are you using to actually serve the file? For instance, does the webserver make its own connection to the other computer without a webserver, and then copy the file to the webserver and then send the file to the end user, or does the user make the connection directly to the computer without a webserver?

  3. In the original question you wrote "this does not cause the browser to prompt for download before grabbing the entire file" and in a comment you wrote "I still don't get a download prompt for the file until the whole thing is downloaded." Does this mean that the file gets displayed in the browser (since it's just text), that after the browser has downloaded the entire file you get a "where do you want to save this file" prompt, or something else?

I have a feeling that there is a chance the HTTP headers are getting stripped out at some point or that a Cache-control header is getting added (which apparently can cause trouble).

When you say "this does not cause the browser to prompt for download" -- what's "the browser"?

Different browsers behave differently, and IE is particularly wilful: it will ignore headers and decide for itself what to do based on reading the first few KB of the file.

In other words, I think your problem may be at the client end, not the server end.

Try lying to "the browser" and telling it the file is of type application/octet-stream. Or why not just zip the file, especially as it's so huge?

Don't use for/foreach (<$input>) because it reads the whole file at once and then iterates over it. Use while (<$input>) instead. The sysread solution is good, but sendfile is the best performance-wise.

I've successfully done it by telling the browser it was of type application/octet-stream instead of type text/plain. Apparently most browsers prefer to display text/plain inline instead of giving the user a download dialog option.

It's technically lying to the browser, but it does the job.
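As a rough sketch of what that header block could look like in a CGI script (`download_headers` is a hypothetical helper written for this example, and quoting the filename is my own choice rather than something from the question):

```perl
use strict;
use warnings;

# Build the CGI header block for a forced download.
# download_headers() is a hypothetical helper, not part of any CPAN module.
sub download_headers {
    my ($filename, $size_in_bytes) = @_;
    return join "\n",
        'Content-Type: application/octet-stream',    # instead of text/plain
        "Content-Length: $size_in_bytes",
        qq{Content-Disposition: attachment; filename="$filename"},
        '', '';    # the trailing blank line ends the headers
}
```

You would print it before the body, e.g. `print download_headers('file.txt', -s $fh);`, and then send the file contents.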

The most efficient way to serve a large file for download depends on the web server you use.

In addition to @Kent Fredric's X-Sendfile suggestion:

File Downloads Done Right has some links that describe how to do it for Apache, lighttpd (mod_secdownload: security via URL generation), and nginx. There are examples in PHP, Ruby (Rails), and Python which can be adapted for Perl.

Basically it boils down to:

  1. Configure paths and permissions for your web server.
  2. Generate valid headers for the redirect in your Perl app (Content-Type, Content-Disposition, Content-Length?, X-Sendfile or X-Accel-Redirect, etc.).
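Step 2 might look roughly like the sketch below. It assumes Apache with the third-party mod_xsendfile module (which takes a filesystem path) or nginx with an `internal` location (which takes a URI under that location). `handoff_headers`, the server keys and the example paths are all made up for illustration.

```perl
use strict;
use warnings;

# Which header each server uses to take over the file transfer.
# Assumption: Apache has mod_xsendfile loaded; nginx has a matching
# "internal" location configured.
my %HANDOFF_HEADER = (
    apache => 'X-Sendfile',          # value is a filesystem path
    nginx  => 'X-Accel-Redirect',    # value is an internal URI
);

# handoff_headers() is a hypothetical helper, not part of any CPAN module.
sub handoff_headers {
    my ($server, $target, $filename) = @_;
    my $header = $HANDOFF_HEADER{$server}
        or die "Unsupported server: $server";
    return join "\n",
        'Content-Type: application/octet-stream',
        qq{Content-Disposition: attachment; filename="$filename"},
        "$header: $target",
        '', '';    # blank line ends the headers; the server supplies the body
}
```

The script then prints something like `handoff_headers('nginx', '/protected/file.txt', 'file.txt')` and exits without sending any body itself.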

There are probably CPAN modules and web-framework plugins that do exactly that; e.g., @Leon Timmermans mentioned Sys::Sendfile in his answer.
