
How do I download a file with WWW::Mechanize after it submits a form?

I have this code:

#!/usr/bin/perl
use strict;
use WWW::Mechanize;

my $url = 'http://divxsubtitles.net/page_subtitleinformation.php?ID=111292';
my $m = WWW::Mechanize->new(autocheck => 1);
$m->get($url);
$m->form_number(2);
$m->click();
my $response = $m->res();
print $m->response->headers->as_string;

It submits the download button on the page, but I'm not sure how to download the file that is sent back after the POST.

I'd like to download this with wget if possible. I was thinking that there may be a secret URL passed back or something? Or will I have to download it with LWP directly from the response stream?

So how do I download the file that is referenced in that header?

Thanks,

Cody Goodman

After submitting the form, you can use:

$mech->save_content( $filename )

Dumps the contents of $mech->content into $filename. $filename will be overwritten. Dies if there are any errors.

If the content type does not begin with "text/", then the content is saved in binary mode.

Source: http://metacpan.org/pod/WWW::Mechanize
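Applied to the question's script, that could look roughly like the sketch below. The helper name download_via_form and its argument order are made up for illustration; only get, form_number, click and save_content are WWW::Mechanize methods.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical helper (not part of WWW::Mechanize): fetch a page, submit one
# of its forms, and save whatever the server sends back.
sub download_via_form {
    my ($mech, $url, $form_number, $filename) = @_;

    $mech->get($url);
    $mech->form_number($form_number);   # forms are numbered from 1
    $mech->click();                     # submit via the form's first button

    # save_content() writes the body of the most recent response; it uses
    # binary mode unless the content type starts with "text/".
    $mech->save_content($filename);
    return $filename;
}
```

For the page in the question this would be called as download_via_form(WWW::Mechanize->new(autocheck => 1), $url, 2, 'subtitles.zip'), where 'subtitles.zip' is only a placeholder output name.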

I tried your code and it returns a stack of HTML in which the only http:// references were:

    http://www.w3c.org
    http://ad.z5x.net
    http://divxsubtitles.net
    http://feeds2read.net
    http://ad.z5x.net
    http://www.google-analytics.com
    http://cls.assoc-amazon.com

using the code

my $content = $m->response->content();
while ( $content =~ m{(http://[^/" \t\n\r]+)}g ) {
    print( "$1\n" );
}

So my comments to you are:
1. Add use strict; to your code. You are programming for failure if you don't.
2. Read the output HTML and determine what to do next. You haven't done that, and therefore you've asked an incomplete question. Unless you identify the URL you want to download, you are asking somebody else to write a program for you.

Once you've identified the URL you want to download, it is a simple matter of getting it and then writing the response content to a file, e.g.

if ( ! open( FOUT, ">output.bin" ) ) {
    die( "Could not create file: $!" );
}
binmode( FOUT ); # required for Windows
print( FOUT $m->response->content() );
close( FOUT );

What threw me off the most was that the form_number method numbers forms starting at 1, whereas most code starts indexing at 0. If anyone wants to know how to download a file that a response sends back as an attachment (named in its headers), this is the way to do it.
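To see which number a given form has, it can help to print the forms with the same 1-based numbering. This is only a sketch: list_forms is a made-up helper, and it relies on the method and action accessors of the HTML::Form objects that WWW::Mechanize returns from forms.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical helper: describe each form using the same 1-based numbering
# that form_number() expects.
sub list_forms {
    my (@forms) = @_;
    my @lines;
    my $number = 1;
    for my $form (@forms) {
        # method() is the HTTP method, action() the URL the form submits to.
        push @lines, sprintf('%d: %s %s', $number++, $form->method, $form->action);
    }
    return @lines;
}
```

With the script above, print "$_\n" for list_forms($m->forms); after the get call lists every form on the page, so the one that posts to the download handler can be picked by number.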

Now here's the full code to do what I wanted.

#!/usr/bin/perl
use strict;
use WWW::Mechanize;

my $url = 'http://divxsubtitles.net/page_subtitleinformation.php?ID=111292';
my $m = WWW::Mechanize->new(autocheck => 1);
$m->get($url);
$m->form_number(2);
$m->click();
my $response = $m->res();
my $filename = $response->filename;

if (! open ( FOUT, ">$filename" ) ) {
    die("Could not create file: $!" );
}
print( FOUT $m->response->content() );
close( FOUT );
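
A slightly more defensive take on the save step might look like the sketch below. It adds the binmode call mentioned in the earlier answer, uses a lexical filehandle with three-argument open, and falls back to a default name when the response does not suggest one. The helper name save_response and its $dir and $fallback parameters are assumptions for illustration, not part of any library.

```perl
use strict;
use warnings;
use File::Spec;
use HTTP::Response;

# Hypothetical helper: write the body of an HTTP::Response into $dir and
# return the path that was written.
sub save_response {
    my ($response, $dir, $fallback) = @_;

    # filename() may return undef when the response suggests no name,
    # so fall back to a caller-supplied one (an assumption of this sketch).
    my $filename = $response->filename;
    $filename = $fallback unless defined $filename && length $filename;

    my $path = File::Spec->catfile($dir, $filename);
    open(my $out, '>', $path)
        or die "Could not create '$path': $!";
    binmode($out);                      # avoid newline translation on Windows
    print {$out} $response->content;
    close($out)
        or die "Could not finish writing '$path': $!";

    return $path;
}
```

In the script above, the open/print/close lines could then be replaced with save_response($m->res, '.', 'download.bin'), where 'download.bin' is just a placeholder fallback name.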
