Downloading a web page with WWW::Mechanize::Firefox

Question

I'm trying to scrape a website using WWW::Mechanize::Firefox, but whenever I try to get the data, the output shows JavaScript code and the data I need is not there. If I inspect the element in Firefox, the data I need is there.

Here's my current code:

#!/usr/bin/perl

use 5.010;
use strict;
use warnings;

use WWW::Mechanize::Firefox;

my $mech = WWW::Mechanize::Firefox->new();

$mech->get('link_goes_here');
$mech->allow( javascript => 0 );
$mech->content_encoding();
$mech->save_content('source.html');

Answer 1

OK. So you have a page that builds its content using JavaScript. Presumably, you have chosen to use WWW::Mechanize::Firefox instead of WWW::Mechanize because it includes support for rendering pages that are built using JavaScript.

And yet, when setting up your Mechanize object, you explicitly turn off JavaScript support:

$mech->allow( javascript => 0 );

I can't test this theory because you haven't told us which URL you are using, but I bet you get a better result if you change that line to:

$mech->allow( javascript => 1 );
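Putting that together, the whole script might look something like the sketch below. It is untested, for the same reason as above. It assumes Firefox is running with the MozRepl add-on that WWW::Mechanize::Firefox talks to, and 'link_goes_here' is still the placeholder from the question. Moving the allow() call ahead of get() is my own assumption, so that the setting is already in place when the page loads, and the content_encoding() call is dropped because its return value was never used.

```perl
#!/usr/bin/perl

use 5.010;
use strict;
use warnings;

use WWW::Mechanize::Firefox;

my $mech = WWW::Mechanize::Firefox->new();

# Enable JavaScript before loading the page (assumed ordering),
# so the page's scripts can build the content.
$mech->allow( javascript => 1 );

# 'link_goes_here' is the placeholder from the question, not a real URL.
$mech->get('link_goes_here');

# Save the current document to a local file.
$mech->save_content('source.html');
```

If the data is still missing from source.html, the page may be filling it in after the initial load, in which case the save would need to wait until that content has appeared.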

Answer 1
3 2017-09-26 09:05:04
