
How can I login and download a file with Perl's WWW::Mechanize?

I'm trying to use Perl's WWW::Mechanize to download a file. I have to log in to the website first and then, after the form has been validated, download the file.

The thing is, after hours of trying, I haven't managed to do what I want. In the end, the script saves a file which is not a zip file but an HTML file with nothing interesting in it.

Here is the script I've written:

use strict;
use warnings;

use WWW::Mechanize;
use HTTP::Cookies;    # needed for the explicit cookie jar below
use Crypt::SSLeay;

my $login    = "MyMail";
my $password = "MyLogin";
my $url = 'http://www.lemonde.fr/journalelectronique/donnees/protege/20101002/Le_Monde_20101002.zip';

my $bot = WWW::Mechanize->new();

# Keep cookies on disk so the authenticated session is reused.
$bot->cookie_jar(
    HTTP::Cookies->new(
        file           => "cookies.txt",
        autosave       => 1,
        ignore_discard => 1,
    )
);

# First request: the site sends back its login page instead of the zip.
my $response = $bot->get($url);

# Fill in and submit the login form.
$bot->form_name("formulaire");
$bot->field('login',    $login);
$bot->field('password', $password);
$bot->submit();

# Request the archive again, now that we should be authenticated.
$response = $bot->get($url);
my $filename = $response->filename;

open( my $fout, '>', $filename )
    or die("Could not create file: $!");
binmode($fout);    # the archive is binary data
print {$fout} $response->content();
close($fout);
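
One thing worth checking before going further (a sketch, not part of the original script; it only uses the $bot and $response variables defined above) is whether the second GET actually returned an archive. If the login did not stick, the server answers with an HTML page again, which matches the symptom described above:

# Sketch: inspect the response before saving it to disk.
my $type = $response->header('Content-Type') || '';
if ( $type =~ m{^application/(?:zip|octet-stream)} ) {
    print "Got a binary archive, saving it.\n";
}
else {
    warn "Got '$type' instead of a zip - the login probably failed.\n";
    print $bot->content();    # dump the page to see the login form or error message
}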

Could you help me find what mistakes I've made?

There are some hidden input fields which I assume are filled in when you navigate to the download using a browser rather than requesting the URL directly.
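
To confirm that, one option (a quick sketch; it reuses the form name "formulaire" from the question and assumes $bot already holds the login page) is to dump every input Mechanize sees on the form, hidden fields included, and compare the list with what the browser actually sends:

# Sketch: list all inputs of the login form, including hidden ones.
$bot->form_name("formulaire");
for my $input ( $bot->current_form->inputs ) {
    printf "%-8s %s = %s\n",
        $input->type || '',
        ( defined $input->name  ? $input->name  : '(unnamed)' ),
        ( defined $input->value ? $input->value : '' );
}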

In addition, they are setting some cookies via JavaScript, and those would not be picked up by Mechanize. However, there is a plugin, WWW::Mechanize::Plugin::JavaScript, which might be able to help you with that (I have no experience with it).
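
If the plugin does not work out, one workaround (a sketch; the cookie name, value and domain below are placeholders, not the site's real values) is to copy the JavaScript-set cookie out of a normal browser session and add it to the jar by hand:

# Sketch: inject a cookie the site normally sets via JavaScript.
# Name, value and domain are hypothetical - take the real ones from a browser.
$bot->cookie_jar->set_cookie(
    0,                     # cookie-spec version
    'js_session',          # cookie name (placeholder)
    'value-from-browser',  # cookie value (placeholder)
    '/',                   # path
    '.lemonde.fr',         # domain
    undef,                 # port
    0,                     # path_spec
    0,                     # secure
    86400,                 # max-age in seconds
    0,                     # discard
);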

Use LiveHTTPHeaders to see what gets submitted by the browser and replicate that (assuming you are not violating their TOS).
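
Once the real request is known, it can be replayed directly instead of relying on the parsed form; everything below (the login URL, the field names, the Referer value) is a placeholder for whatever LiveHTTPHeaders actually reports:

# Sketch: replay the login POST the way the browser sends it.
$bot->add_header( Referer => 'http://www.lemonde.fr/' );
$bot->post(
    'http://www.lemonde.fr/login-url-from-livehttpheaders',   # placeholder
    {
        login        => $login,
        password     => $password,
        hidden_field => 'value-captured-in-livehttpheaders',  # placeholder
    },
);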

The problem you mention is well known in Mechanize. The simplest solution is to use the Raspo library.
