
Create thumbnails with WWW::Mechanize::Firefox using MozRepl - some debug attempts

I am running this script, which is written to take screenshots of websites. I also have MozRepl up and running.

Here is the file with some of the requested URLs. Note that this is only a short snippet of the real list, which is much longer: it contains more than 3500 lines and URLs.

http://www.unifr.ch/sfm
http://www.zug.phz.ch
http://www.schwyz.phz.ch
http://www.luzern.phz.ch
http://www.schwyz.phz.ch
http://www.phvs.ch
http://www.phtg.ch
http://www.phsg.ch
http://www.phsh.ch
http://www.phr.ch
http://www.hepfr.ch/
http://www.phbern.ch
http://www.ph-solothurn.ch
http://www.pfh-gr.ch
http://www.ma-shp.luzern.phz.ch
http://www.heilpaedagogik.phbern.ch/

What is strange is the output, shown further down. Question: should I change the script? Why do I get this output with the following little script?

#!/usr/bin/perl

use strict;
use warnings;
use WWW::Mechanize::Firefox;

my $mech = new WWW::Mechanize::Firefox();

open(INPUT, "<urls.txt") or die $!;

while (<INPUT>) {
        chomp;
        print "$_\n";
        $mech->get($_);
        my $png = $mech->content_as_png();
        my $name = "$_";
        $name =~s/^www\.//;
        $name .= ".png";
        open(OUTPUT, ">$name");
        print OUTPUT $png;
        sleep (5);
}

Here is the rather overwhelming output. To be frank, I never thought I would get such odd output, so I have to debug the whole code:

http://www.unifr.ch/sfm
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 2.
http://www.zug.phz.ch
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 3.
http://www.schwyz.phz.ch
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 4.
http://www.luzern.phz.ch
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 5.
http://www.schwyz.phz.ch
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 6.
http://www.phvs.ch
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 7.
http://www.phtg.ch
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 8.
http://www.phsg.ch
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 9.
http://www.phsh.ch
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 10.
http://www.phr.ch
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 11.
http://www.hepfr.ch/
print() on closed filehandle OUTPUT at test_3.pl line 20, <INPUT> line 12.
http://www.phbern.ch                                                                                                                                                                  

Some musings: First, I think this is not a very serious error; I have to debug it and then it will work better. Second, I first thought that the script seemed to "overload the machine". Now I am not so sure about that: the symptoms do look strange, but I guess it is not necessary to conclude that the machine is overloaded.

Third, I am thinking about the steps that have to be taken to establish whether the problem is related to WWW::Mechanize::Firefox at all. This leads me to the question of what the Perl warning means, and to the idea of using the diagnostics pragma to get more explanation. What do you think?

print() on unopened filehandle FH at -e line 1 (#2) (W unopened) An I/O operation was attempted on a filehandle that was never initialized.

First, we need to do an open(), a sysopen(), or a socket() call, or call a constructor from the FileHandle package. Besides that, searching for "print() on closed filehandle OUTPUT" also gives lots of answers, which tell us that we did not use autodie and did not check the return value of open. Above all, I have to debug it and find where the error comes into play.
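To see why open fails here, it may help to check its return value and print $!. The following is only a minimal sketch, not the original script: try_open is a hypothetical helper name, and the file name is the one the script above would build for the first URL (the "http://" prefix is never removed, so the name still contains slashes and points into directories that presumably do not exist).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The name the original script builds for the first URL: s/^www\.//
# does not match because the string starts with "http://".
my $name = 'http://www.unifr.ch/sfm' . '.png';

# try_open is a hypothetical helper: it returns the filehandle on
# success, or undef after warning with the reason stored in $!.
sub try_open {
    my ($path) = @_;
    if (open(my $fh, '>', $path)) {
        return $fh;
    }
    warn "Cannot open '$path' for writing: $!\n";
    return;
}

my $fh = try_open($name);
print defined $fh ? "opened $name\n" : "open failed for $name\n";
close($fh) if defined $fh;
```

If the warning reports a missing file or directory, the problem is the file name and not WWW::Mechanize::Firefox.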

But after some musings I think it is worth having a closer look at all of this. What do you think about the idea of always testing that the file is open before using it? That means we should also get into the habit of using the three-argument form of open():

open my $fh, '>', $name or die "Can't open file $name : $!";
print $fh $stuff;

I guess that we can or should work around this without using die(), but then we would need some method of our own to let us know which files could not be created. In our case, it looks like that is all of them, as shown above.
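One way to do this without die() could look like the sketch below. It is only an illustration under assumptions: save_png is a hypothetical helper, the data is a placeholder string standing in for what $mech->content_as_png() would return, and a temporary directory from File::Temp is used so the example does not litter the working directory.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# save_png is a hypothetical helper: it writes $data to $name in binary
# mode and returns 1 on success, 0 on failure (after a warning).
sub save_png {
    my ($name, $data) = @_;
    my $fh;
    unless (open($fh, '>', $name)) {
        warn "Skipping '$name': $!\n";
        return 0;
    }
    binmode($fh);
    print {$fh} $data;
    unless (close($fh)) {
        warn "Could not finish writing '$name': $!\n";
        return 0;
    }
    return 1;
}

my $dir = tempdir(CLEANUP => 1);

# Placeholder data; one target is valid, one lies in a missing directory.
my %shots = (
    "$dir/ok.png"          => 'fake png bytes',
    "$dir/missing/bad.png" => 'fake png bytes',
);

my @failed;
for my $name (sort keys %shots) {
    push @failed, $name unless save_png($name, $shots{$name});
}

print "Could not create: $_\n" for @failed;
```

The loop keeps going after a failure, and @failed holds the list of files to look at afterwards.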

Update: by "choosing a good file name" you mean that I need to have a file name under which to store the images. Note: I want to store all of them locally. But if I have a huge list of URLs, then I get a huge list of output files. Therefore I need good file names. Can we reflect those needs in the program?
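A possible way to derive file names from the URLs is sketched below. The helper name url_to_filename and the exact replacement rules are my own assumptions, not something prescribed by WWW::Mechanize::Firefox: the scheme and a leading "www." are stripped, and every run of characters that is unsafe in a file name becomes an underscore.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# url_to_filename is a hypothetical helper that turns a URL into a
# flat file name without slashes or colons.
sub url_to_filename {
    my ($url) = @_;
    my $name = $url;
    $name =~ s/^\s+|\s+$//g;            # trim surrounding whitespace
    $name =~ s{^https?://}{}i;          # drop the scheme
    $name =~ s{^www\.}{}i;              # drop a leading "www."
    $name =~ s{/+$}{};                  # drop trailing slashes
    $name =~ s{[^A-Za-z0-9._-]+}{_}g;   # replace anything unsafe
    return "$name.png";
}

for my $url ('http://www.unifr.ch/sfm', 'http://www.hepfr.ch/') {
    print url_to_filename($url), "\n";
}
# prints:
#   unifr.ch_sfm.png
#   hepfr.ch.png
```

Note that a URL which appears twice in the list (http://www.schwyz.phz.ch does) maps to the same name, so the second screenshot would overwrite the first unless a counter or timestamp is added.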

The very latest update: there seem to be some errors with Mechanize... I guess so!

If you would like to print binary data (such as an image file), you have to set binary mode on the filehandle explicitly. Second, close a filehandle when you do not need it anymore, and use 'or die' on open. Third, choose a good file name.

Regards,

http://perldoc.perl.org/functions/binmode.html

#!/usr/bin/perl

use strict;
use warnings;
use WWW::Mechanize::Firefox;

my $mech = new WWW::Mechanize::Firefox();

open(INPUT, "<urls.txt") or die $!;

while (<INPUT>) {
        chomp;
        next unless $_ =~ m/http/i;    # skip lines that do not contain a URL
        print "$_\n";
        $mech->get($_);
        my $png = $mech->content_as_png();
        my $name = "$_";
        $name =~s#http://##is;
        $name =~s#/##gis;
        $name =~s#\s+\z##is;
        $name =~s#\A\s+##is;
        $name =~s/^www\.//;
        $name .= ".png";
        open(my $out, ">",$name) or die $!;
        binmode($out);
        print $out $png;
        close($out);
        sleep (5);
}
