
Web crawling in JavaScript: Error: connect ETIMEDOUT

Hi, I'm following a tutorial on how to crawl the web using JavaScript, but I get this error when I run the script:

Visiting page https://arstechnica.com/                       (testcrawl.js:6)
Error: Error: connect ETIMEDOUT 50.31.169.131:443            (testcrawl.js:9)
TypeError: Cannot read property 'statusCode' of undefined    (testcrawl.js:12)

    at Request._callback (c:\Users\nab\practise\testcrawl.js:12:43)
    at self.callback (c:\Users\nab\node_modules\request\request.js:185:22)
    at emitOne (events.js:116:13)
    at Request.emit (events.js:211:7)
    at Request.onRequestError (c:\Users\nab\node_modules\request\request.js:877:8)
    at emitOne (events.js:116:13)
    at ClientRequest.emit (events.js:211:7)
    at TLSSocket.socketErrorListener (_http_client.js:387:9)
    at emitOne (events.js:116:13)
    at TLSSocket.emit (events.js:211:7)

This is the script I'm running:

var request = require('request');
var cheerio = require('cheerio');
var URL = require('url-parse');

var pageToVisit = "https://arstechnica.com/";
console.log("Visiting page " + pageToVisit);
request(pageToVisit, function(error, response, body) {
   if(error) {
     console.log("Error: " + error);
   }
   // Check status code (200 is HTTP OK)
   console.log("Status code: " + response.statusCode);
   if(response.statusCode === 200) {                      
     // Parse the document body
     var $ = cheerio.load(body);
     console.log("Page title:  " + $('title').text());
   }
});

Why am I getting this error, and how do I solve it?

Please try the following and let me know the result.

var request = require('request');
var cheerio = require('cheerio');

var pageToVisit = "https://arstechnica.com/";
console.log("Visiting page " + pageToVisit);
// timeout is given in milliseconds
request({url: pageToVisit, timeout: 20000}, function(error, response, body) {
   if(error) {
     console.log("Error: " + error);
     // response is undefined when the request fails, so stop here
     return;
   }
   // Check status code (200 is HTTP OK)
   console.log("Status code: " + response.statusCode);
   if(response.statusCode === 200) {
     // Parse the document body
     var $ = cheerio.load(body);
     console.log("Page title:  " + $('title').text());
   }
});

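Note that I have added a timeout option (20000 ms) to see how it goes. The TypeError in your output is a separate, follow-on problem: when the request fails, response is undefined, so the callback should return after logging the error instead of going on to read response.statusCode.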
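
To illustrate the error-handling point, here is a minimal sketch of a callback body that checks error before touching response. The helper name describeResult is hypothetical (it is not part of the request library), the regex is only a dependency-free stand-in for cheerio's $('title').text(), and the HTML string is made-up sample data. The calls at the bottom are simulated so the sketch runs without network access.

```javascript
// Hypothetical helper: turns the (error, response, body) arguments of a
// request-style callback into a single message string.
function describeResult(error, response, body) {
  if (error) {
    // On a failed request `response` is undefined, so return before using it.
    // Node system errors carry a `code` property such as 'ETIMEDOUT'.
    var kind = error.code === 'ETIMEDOUT' ? 'timed out' : 'request failed';
    return 'Error (' + kind + '): ' + error.message;
  }
  if (response.statusCode !== 200) {
    return 'Unexpected status code: ' + response.statusCode;
  }
  // Regex stand-in for cheerio, used only to keep this sketch self-contained.
  var match = /<title[^>]*>([^<]*)<\/title>/i.exec(body);
  return 'Page title: ' + (match ? match[1].trim() : '(none found)');
}

// Simulated failure, shaped like the error in the question.
var timeoutError = new Error('connect ETIMEDOUT 50.31.169.131:443');
timeoutError.code = 'ETIMEDOUT';
console.log(describeResult(timeoutError, undefined, undefined));

// Simulated success with made-up HTML.
var sampleBody = '<html><head><title> Example page </title></head></html>';
console.log(describeResult(null, { statusCode: 200 }, sampleBody));
```

With the request library, this would presumably be wired in as request({url: pageToVisit, timeout: 20000}, function(error, response, body) { console.log(describeResult(error, response, body)); }). If the connection still times out with this in place, the script will at least report the timeout cleanly instead of crashing with the TypeError.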
