
Node.js error crawling with npm crawler

When using the npm crawler to crawl links, I get the error:

C:\Users\ryani\Desktop\JavaScript\crawler\crawler.js:15
                                $('a').each(function(index, value) {
                                      ^

TypeError: Cannot read property 'each' of undefined

I've tried setting timeouts and various debugging techniques, but I'm not sure why it is getting undefined; when I put the code in a tag on an HTML page, it works fine.

crawler.js:

var Crawler = require("crawler");

var c = new Crawler({
    "maxConnections":10,

    "callback":function(error, res, $) {

        if (error) {
            console.log("error");
            console.log(error);
        } else {
            $('a').each(function(index, value) {
                console.log($(this).attr('href'));
                //c.queue(href)
            });
        }
    }
});

c.queue('http://www.google.com');

The problem is that you have not initialized Cheerio (var $ = res.$;).

Try this version; it fetches all the links from the provided URL:

var Crawler = require("crawler");

var c = new Crawler({
    maxConnections: 10,
    // This will be called for each crawled page 
    callback: function(error, res, done) {
        if (error) {
            console.log(error);
        } else {
            var $ = res.$;
            var links = [];

            $('a').each(function(i, elem) {
                links[i] = $(this).attr('href');
            });
            // $ is Cheerio by default 
            //a lean implementation of core jQuery designed specifically for the server 
            console.log(links);
        }
        done();
    }
});

c.queue('http://www.google.com');
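
The links array built above can contain undefined entries (anchors without an href attribute) and relative paths. If you later want to feed the links back into c.queue, a clean-up step along these lines might help. This is only a sketch: normalizeLinks is a hypothetical helper name, not part of the crawler module, and it relies only on the URL class built into Node.js.

```javascript
// Hypothetical helper (not part of the crawler module): turns raw href
// values into a de-duplicated list of absolute http(s) URLs.
function normalizeLinks(hrefs, baseUrl) {
    var seen = {};
    var out = [];
    hrefs.forEach(function (href) {
        // Anchors without an href attribute yield undefined; skip them.
        if (typeof href !== 'string' || href === '') return;
        var absolute;
        try {
            // Resolve relative paths against the page that was crawled.
            absolute = new URL(href, baseUrl).href;
        } catch (e) {
            return; // not a parseable URL
        }
        // Keep only http and https links (drops mailto:, javascript:, ...).
        if (!/^https?:/.test(absolute)) return;
        if (!seen[absolute]) {
            seen[absolute] = true;
            out.push(absolute);
        }
    });
    return out;
}

var cleaned = normalizeLinks(
    ['/about', undefined, 'http://example.com/x', '/about', 'mailto:a@b.c'],
    'http://www.google.com'
);
console.log(cleaned);
// [ 'http://www.google.com/about', 'http://example.com/x' ]
```

Inside the callback you would call it as normalizeLinks(links, 'http://www.google.com') before printing or queueing the results.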

I have never used the Crawler node module before, but I looked at their usage example.

The 3 parameters of the callback function are:

  1. error - The error, if any, returned by the crawler API
  2. res - The response object; $ is one of its properties
  3. done - Another callback function, which your code calls when it is done processing

By writing your code as callback: function(error, res, $) {, you put the $ variable in the 3rd parameter position, so it actually refers to the done function. Writing $('a') therefore calls done('a') rather than Cheerio. That call does not return a Cheerio object; the result is undefined, so .each cannot be read from it, hence the error.
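
The positional binding described above can be reproduced without the library. The sketch below uses a hypothetical stand-in, fakeCrawl, that calls its callback with (error, res, done); it is not the crawler module's real implementation, and it assumes that done returns nothing, which is consistent with the error message in the question.

```javascript
// Hypothetical stand-in for the crawler: it invokes the callback with
// (error, res, done), the order listed above. Not the real library.
function fakeCrawl(callback) {
    var res = {
        // Stand-in for Cheerio: returns an object that has an each() method.
        $: function (selector) {
            return { each: function (fn) {} };
        }
    };
    // Stand-in for done (assumption: it returns nothing, i.e. undefined).
    function done() {}
    callback(null, res, done);
}

var observed = {};

fakeCrawl(function (error, res, $) {
    // The third argument is bound to the name $, whatever the name suggests.
    observed.dollarIsResDollar = ($ === res.$); // false
    observed.callResult = $('a');               // undefined
    try {
        $('a').each(function () {});
    } catch (e) {
        observed.errorName = e.name;            // "TypeError"
    }
});

console.log(observed);
```

Naming the third parameter done and reading Cheerio from res.$ avoids the mix-up.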

That is, your code should look something like:

    var links = [];
    res.$('a').each(function(i, elem) {
        // Inside your callback, $ is still the done function, so use res.$ here too
        links[i] = res.$(this).attr('href');
    });

Also, you will need to call the done parameter, otherwise the process will just hang there; with your current signature that means calling $(). However, I recommend following their code example and renaming your 3rd parameter from $ to done, as $ is not a good name for it.
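
To illustrate why a missing done() call stalls things, here is a minimal sketch of a one-at-a-time task queue. makeQueue is a hypothetical toy, not how the crawler module is implemented; it only shows the general pattern in which the next task starts when the current one signals completion.

```javascript
// Hypothetical toy queue: runs one task at a time and starts the next
// task only after the current one has called done().
function makeQueue() {
    var pending = [];
    var busy = false;
    var finished = [];

    function next() {
        if (busy || pending.length === 0) return;
        busy = true;
        var task = pending.shift();
        task.run(function done() {
            busy = false;
            finished.push(task.name);
            next();
        });
    }

    return {
        queue: function (name, run) {
            pending.push({ name: name, run: run });
            next();
        },
        finished: finished
    };
}

// Every task calls done(): both tasks complete.
var good = makeQueue();
good.queue('a', function (done) { done(); });
good.queue('b', function (done) { done(); });
console.log(good.finished);  // [ 'a', 'b' ]

// The first task never calls done(): the queue stays busy forever.
var stuck = makeQueue();
stuck.queue('a', function (done) { /* done() is never called */ });
stuck.queue('b', function (done) { done(); });
console.log(stuck.finished); // []
```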
