
Node.js request: getting ETIMEDOUT 'ip address'

Here's what I am doing in the code:

I am reading a text file with around 3,500 links. For each link, I filter for the ones I want and make a request to get the status code, link, and page title (using cheerio). After looping through roughly the 100th or 200th link, I get "connect ETIMEDOUT 40...:443". The links look good. What's going on here? Is the web server kicking me out because it thinks this is a DDoS? I am doing this for the company I work for, and that is obviously not the intention. If any of you want to test with a large number of links, I used https://hackertarget.com/extract-links/ to get the links and then put them in a text file.

Here is my code:

    var request = require('request');
    var cheerio = require('cheerio');
    var URL = require('url-parse');
    var axios = require('axios');
    const fs = require('fs');
    const readline = require('readline');

    var main = [];
    var linkdata = [];

    const rl = readline.createInterface({
        input: fs.createReadStream('C:/Users/Jay/Documents/Javascript/crawl/links.txt'),
        crlfDelay: Infinity
    });

    rl.on('line', (link) => {
        if (link.startsWith('https://www.example.com')) {
            var encodeLink = encodeURI(link)
            request(encodeURI(encodeLink), function (error, response, body) {
                console.log("Link: ",encodeLink)
                if (error) {
                    console.log("Error:Request " + error);
                }
                // Check status code (200 is HTTP OK)
                if (response.statusCode === 200) {
                    // Parse the document body
                    var $ = cheerio.load(body);
                    var Status_200 = {
                        "status Code": response.statusCode,
                        "Page title:": $('title').text(),
                        "Original Link": encodeLink,
                    }
                    main.push(Status_200)
                }
                if (response.statusCode === 302 || response.statusCode === 404 || response.statusCode === 500) {
                    // Parse the document body
                    var Status_Errors = {
                        "status Code": response.statusCode,
                        "Page title:": $('title').text(),
                        "Original Link": encodeLink,
                    }
                    main.push(Status_Errors)
                }
                //console.log(JSON.stringify(main))
                fs.writeFile("C:/Users/Jay/Documents/Javascript/crawl/output.json", JSON.stringify(main), (err) => {
                    if (err) console.log(err);
                    console.log("Successfully Written to File.");
                });
            })
        }
    });

Put a try/catch around it, since you are using async, to see if that helps with the memory error you are getting. It is probably good practice anyway.

    try {
        const body = response.data;
        if (response.status === 200) {
            // do your thing
        }
        if (response.status === 302 || response.status === 404 || response.status === 500) {
            // Parse the document body
            // do your thing
        }
        fs.writeFile("C:/Users/T440/Documents/crawl/output.json", JSON.stringify(main), (err) => {
            if (err) console.log(err);
            console.log("Successfully Written to File.");
        });
    } catch (error) {
        // catch the errors
    }
    main.push(Status_ErrorsCatch)
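As a rough illustration of the try/catch advice, here is a minimal sketch that turns one awaited request into a plain record, whether it succeeds or fails. The names `checkLink` and `httpGet` are hypothetical and not from the original post: `httpGet` is assumed to be any function that takes a URL and resolves to an object with `status` and `data`, which is the shape axios responses have. The title regex is only a stand-in for cheerio's `$('title').text()`.

```javascript
// Sketch: wrap a single awaited request in try/catch so that a network
// failure (for example ETIMEDOUT) becomes a record instead of an
// unhandled error. `httpGet` is a hypothetical injected function:
//   (url) => Promise<{ status: number, data: string }>
async function checkLink(link, httpGet) {
  try {
    const response = await httpGet(link);
    const body = typeof response.data === 'string' ? response.data : '';
    // Stand-in for cheerio's $('title').text(); assumption for this sketch.
    const match = body.match(/<title[^>]*>([^<]*)<\/title>/i);
    return {
      'status Code': response.status,
      'Page title:': match ? match[1].trim() : '',
      'Original Link': link,
    };
  } catch (error) {
    // There is no response object on this path, so read only from the error.
    return {
      'status Code': null,
      'Error': error.code || error.message,
      'Original Link': link,
    };
  }
}
```

With axios, `httpGet` could be something like `(url) => axios.get(url, { timeout: 10000, validateStatus: () => true })`, so that non-2xx responses resolve instead of throwing. The 10-second timeout is an arbitrary value chosen for illustration.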

Following some suggestions from the comments, I slowed the process down by using the readline async iterator structure, and switched to axios because it is more promise-friendly.

Here is a sample of how I fixed the ETIMEDOUT 'ip address' issue. I am having a memory issue now, but I think the original problem is solved.

    async function processLineByLine() {
        const rl = readline.createInterface({
            input: fs.createReadStream('C:/Users/T440/Documents/crawl/links.txt'),
            crlfDelay: Infinity
        });

        for await (const line of rl) {
            if (line.startsWith('https://www.example.com')) {
                var encodeLink = encodeURI(line);
                const response = await axios.get(encodeLink).catch((err)=>{
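Awaiting each request inside the loop means only one request is in flight at a time. If that turns out to be too slow, one possible middle ground is to cap the number of simultaneous requests. The sketch below is not from the original post: `mapWithLimit` is a hypothetical helper, and it assumes the links have already been read into an array and that `limit` is at least 1.

```javascript
// Sketch: run `worker` over `items` with at most `limit` calls in flight.
// Results are returned in the same order as `items`.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item that has not been claimed yet

  // Each runner repeatedly claims the next unclaimed item until none remain.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  // Start `limit` runners (or fewer if there are fewer items) and wait for all.
  const count = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: count }, () => runner()));
  return results;
}
```

A call might look like `await mapWithLimit(links, 5, (link) => fetchOne(link))`, where `fetchOne` is whatever per-link function you already have and `5` is an arbitrary limit. If `worker` can throw, it should catch its own errors; otherwise the first rejection makes the whole `mapWithLimit` call reject.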
