
Crawling with “npm crawler”

For example, I want to crawl the descriptions of Node.js modules from npmjs.org, but this code doesn't work. How can I do this with jQuery rather than with the jsdom module?

var Crawler = require("crawler").Crawler;
var crawler = new Crawler({
    "maxConnections": 10
});

crawler.queue([{
    "uri": "https://npmjs.org/package/crawler",
    "callback": function (error, result) {
        console.log("description:", window.$("p.description").text());
    }
}]);

Your code exits too early. Add a setTimeout on the last line to give your code enough time to complete.

Then call process.exit() from your callback function.
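The timer-plus-explicit-exit idea can be sketched roughly as follows. This is only an illustration: `withDeadline`, `start`, `finished`, and `exit` are hypothetical names introduced here and are not part of the crawler module. The exit function is passed in so the helper can be tried without ending the process; in a real script you would probably pass `process.exit`.

```javascript
// Sketch only: `withDeadline`, `start`, `finished`, and `exit` are hypothetical names.
function withDeadline(start, timeoutMs, exit) {
  // A pending timer keeps the Node.js process alive until it fires or is cleared.
  var timer = setTimeout(function () {
    console.error("timed out after " + timeoutMs + " ms");
    exit(1);
  }, timeoutMs);

  // `start` receives a function to call from the crawler callback once the work is done.
  start(function finished() {
    clearTimeout(timer);
    exit(0);
  });
}

// Intended usage (assumes `crawler` was created as in the question):
// withDeadline(function (finished) {
//   crawler.queue([{
//     "uri": "https://npmjs.org/package/crawler",
//     "callback": function (error, result) { /* ...print... */ finished(); }
//   }]);
// }, 10000, process.exit);
```

The 10000 ms value in the usage comment is an arbitrary choice; pick whatever upper bound suits the pages you crawl.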

The crawler callback takes three parameters, the third being jQuery, so you probably want something like this:

"callback":function(error,result,$) {
  console.log("description:",$("p.description").text());
}
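Putting that together, a fuller sketch might look like the one below. It assumes the older crawler API used in the question, where `require("crawler").Crawler` is a constructor and the callback receives `(error, result, $)`; newer releases of the package may differ. `crawlDescription` and `done` are hypothetical names introduced for this example, and the constructor is passed in as a parameter so the function can be exercised without network access.

```javascript
// Sketch only: assumes the older `crawler` API from the question, where the
// callback receives (error, result, $). `crawlDescription` and `done` are
// hypothetical names, not part of the crawler module.
function crawlDescription(Crawler, uri, done) {
  var crawler = new Crawler({ "maxConnections": 10 });

  crawler.queue([{
    "uri": uri,
    // The third argument is the jQuery object bound to the fetched page.
    "callback": function (error, result, $) {
      if (error) {
        done(error, null);
        return;
      }
      done(null, $("p.description").text());
    }
  }]);
}

// Intended usage (requires the `crawler` package and network access):
// crawlDescription(require("crawler").Crawler,
//   "https://npmjs.org/package/crawler",
//   function (error, description) {
//     console.log("description:", description);
//     process.exit(error ? 1 : 0); // exit explicitly once the work is done
//   });
```

Whether `p.description` still matches anything depends on the current markup of the npm site, so treat the selector as an assumption carried over from the question.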
