Crawling with the npm "crawler" module

Question

For example, I want to crawl the descriptions of Node.js modules from npmjs.org, but this code doesn't work. How can I do this with jQuery rather than with the jsdom module?

var Crawler = require("crawler").Crawler;
var crawler = new Crawler({
    "maxConnections": 10
});

crawler.queue([{
    "uri": "https://npmjs.org/package/crawler",
    "callback": function(error, result) {
        console.log("description:", window.$("p.description").text());
    }
}]);

Answer 1

Your code exits too early. Add a setTimeout on the last line to give your code enough time to complete.

Then call process.exit() from your callback function.

The crawler callback takes three parameters, the third being jQuery, so you probably want something like this:

"callback":function(error,result,$) {
  console.log("description:",$("p.description").text());
}
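Putting those pieces together, a complete script might look roughly like the sketch below. It assumes the same API shape as the code in the question (require("crawler").Crawler and a per-request "callback" option); later releases of the package may expose a different interface. The names extractDescription and crawlDescription and the 30-second timeout are hypothetical, chosen only for illustration.

```javascript
// Callback logic in the three-argument form (error, result, $).
// `extractDescription` is a hypothetical helper name, not part of the crawler package.
function extractDescription(error, result, $) {
  if (error) {
    return null; // nothing to extract if the request failed
  }
  return $("p.description").text();
}

// Wires the helper into the crawler. Assumes the API shape used in the
// question: require("crawler").Crawler and a per-request "callback" option.
function crawlDescription(uri, done) {
  var Crawler = require("crawler").Crawler;
  var crawler = new Crawler({ "maxConnections": 10 });
  crawler.queue([{
    "uri": uri,
    "callback": function (error, result, $) {
      done(extractDescription(error, result, $));
    }
  }]);
}

// Example use (left commented out so nothing hits the network on load):
// setTimeout(function () {}, 30000); // keep-alive timer, per the suggestion above
// crawlDescription("https://npmjs.org/package/crawler", function (description) {
//   console.log("description:", description);
//   process.exit(); // exit explicitly once the callback has run
// });
```

Keeping the extraction in its own function means it can be checked with a stand-in for $ before pointing the crawler at a real page.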

Solution 1 (accepted, 2013-02-05 23:40:52)
