
How to download HTML content generated by PHP/JavaScript using wget or Perl

I have a URL which I want to download and parse:

http://diana.cslab.ece.ntua.gr/micro-CDS/index.php?r=search/results_mature&mir=hsa-miR-3131&kwd=MIMAT0014996

The problem is that when I download it with Unix wget like this:

$ wget 'http://diana.cslab.ece.ntua.gr/micro-CDS/index.php?r=search/results_mature&mir=hsa-miR-3131&kwd=MIMAT0014996'

it gives me content that is different from what I see in the browser (namely, the list of genes is not there). Note that the & characters in the URL must be quoted in the shell; otherwise everything after the first & is split off and run as a separate background command.

What's the right way to do it programmatically?

I've just tested this with PHP, and it pulls the page, gene list included, just fine:

<?php
echo file_get_contents('http://diana.cslab.ece.ntua.gr/micro-CDS/index.php?r=search/results_mature&mir=hsa-miR-3131&kwd=MIMAT0014996');
?>

Do you have access to PHP? (Note that fetching URLs with file_get_contents requires allow_url_fopen to be enabled in php.ini.)
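
If PHP isn't available, the same kind of one-shot fetch works in Perl too. A minimal sketch using LWP::Simple (note this sends LWP's default User-Agent, which the server may treat differently from a browser; the WWW::Mechanize answer below covers that case):

#!/usr/bin/perl
use strict;
use warnings;

use LWP::Simple qw(get);

my $url = 'http://diana.cslab.ece.ntua.gr/micro-CDS/index.php?r=search/results_mature&mir=hsa-miR-3131&kwd=MIMAT0014996';

# get() returns the response body, or undef on failure
my $html = get($url);
defined $html or die "Failed to fetch $url\n";
print $html;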

#!/usr/bin/perl
use strict;
use warnings;

use WWW::Mechanize;

my $url = "http://diana.cslab.ece.ntua.gr/micro-CDS/index.php?r=search/results_mature&mir=hsa-miR-3131&kwd=MIMAT0014996";

my $mech = WWW::Mechanize->new();
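# pose as a regular browser; some servers vary their output by User-Agent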
$mech->agent_alias("Windows IE 6");

$mech->get($url);
# now you have access to the HTML code via $mech->content()

To process the HTML, I strongly recommend HTML::TreeBuilder::XPath (or another HTML parsing module).
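
For example, here is a minimal sketch that fetches the page and dumps text out of the parse tree. The '//td' XPath is only a placeholder; inspect the page and replace it with an expression that matches the gene list's actual markup:

#!/usr/bin/perl
use strict;
use warnings;

use WWW::Mechanize;
use HTML::TreeBuilder::XPath;

my $url = "http://diana.cslab.ece.ntua.gr/micro-CDS/index.php?r=search/results_mature&mir=hsa-miR-3131&kwd=MIMAT0014996";

my $mech = WWW::Mechanize->new();
$mech->agent_alias("Windows IE 6");
$mech->get($url);

# build a searchable tree from the fetched HTML
my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse($mech->content);
$tree->eof;

# '//td' is a placeholder XPath -- adjust it to the page's real structure
print "$_\n" for $tree->findvalues('//td');

$tree->delete;    # free memory held by the parse tree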
