I have a list of genes to download from the following link. The problem is that the results are split across 60 pages, which are selected through a drop-down list.
http://diana.cslab.ece.ntua.gr/micro-CDS/index.php?r=search/results_mature&mir=hsa-miR-3131&kwd=MIMAT0014996
How can I make WWW::Mechanize access all the genes from all the pages?
This is the current code I have:
use WWW::Mechanize;
use strict;
use warnings;
my $url = "http://diana.cslab.ece.ntua.gr/micro-CDS/index.php?r=search/results_mature&mir=hsa-miR-3131&kwd=MIMAT0014996";
my $mech = WWW::Mechanize->new();
$mech->agent_alias("Windows IE 6");
$mech->get($url);
# This only fetches the first page.
The page drop-down is implemented using JavaScript. You can't drive it with Mechanize, because Mechanize doesn't execute JavaScript. See the WWW::Mechanize FAQ.
This is easy: the page number is part of the URL (this example fetches page 11):
my $page_number = 11;
$mech->get( "http://diana.cslab.ece.ntua.gr/micro-CDS/index.php?r=search%2Finitializesearch&keywords=MIMAT0014996&thr=0.41&kegg=&page=" . $page_number );
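my $pages = 60;
for my $i (1 .. $pages) {
    # The page number is the last query parameter of the URL.
    my $url = "http://diana.cslab.ece.ntua.gr/micro-CDS/index.php?r=search%2Finitializesearch&keywords=MIMAT0014996&thr=0.41&kegg=&page=$i";
    $mech->get($url);
    # Process $mech->content here; the next get() replaces it.
}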
This should do it. You just need to iterate through the 60 pages, modifying the URL each time.
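For reference, here is a minimal sketch of the whole loop as a standalone script. It assumes the paged URL from the answers above is still valid and that there really are 60 pages. The helper names page_url and fetch_pages are hypothetical, chosen only for illustration, and the one-second pause between requests is a courtesy to the server rather than anything the site is known to require.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Base URL taken from the answers above; the page number is appended to it.
my $base = 'http://diana.cslab.ece.ntua.gr/micro-CDS/index.php'
         . '?r=search%2Finitializesearch&keywords=MIMAT0014996&thr=0.41&kegg=&page=';

# Hypothetical helper: build the URL for one results page.
sub page_url {
    my ($page) = @_;
    return $base . $page;
}

# Hypothetical helper: fetch pages 1..$last and return [page, html] pairs.
sub fetch_pages {
    my ($mech, $last) = @_;
    my @results;
    for my $page (1 .. $last) {
        my $response = $mech->get( page_url($page) );
        if ( $response->is_success ) {
            push @results, [ $page, $mech->content ];
        }
        else {
            warn "Page $page failed: ", $response->status_line, "\n";
        }
        sleep 1;    # pause between requests (assumed courtesy delay)
    }
    return @results;
}

# Only touch the network when run directly, not when loaded by other code.
unless (caller) {
    # autocheck => 0 so a failed page is reported instead of ending the run.
    my $mech = WWW::Mechanize->new( autocheck => 0 );
    $mech->agent_alias('Windows IE 6');
    my @pages = fetch_pages( $mech, 60 );
    printf "Fetched %d pages\n", scalar @pages;
}
```

Each element of the returned list holds a page number and that page's HTML, so the gene rows can be extracted afterwards with whatever parser you prefer.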