Too many simultaneous requests with NodeJS+request-promise

I have a NodeJS project with a big array (about 9000 elements) of URLs. Those URLs are going to be requested using the request-promise package. However, neither the server nor the client likes 9000 concurrent GET requests to the same website from the same client, so I want to spread them out over time. I looked around a bit and found Promise.map together with the {concurrency: int} option, which sounded like it would do what I want. But I cannot get it to work. My code looks like this:

const rp = require('request-promise');
var MongoClient = require('mongodb').MongoClient;

var URLarray = []; //This contains 9000 URLs

function getWebsite(url) {
  rp(url)
    .then(html => { /* Do some stuff */ })
    .catch(err => { console.log(err) });
}

MongoClient.connect('mongodb://localhost:27017/some-database', function (err, client) {
  Promise.map(URLArray, (url) => {
    db.collection("some-collection").findOne({URL: url}, (err, data) => {
      if (err) throw err;
      getWebsite(url, (result) => {
        if(result != null) {
          console.log(result);
        }
      });
    }, {concurrency: 1});
});

I think I probably misunderstand how to deal with promises. In this scenario I would have thought that, with the concurrency option set to 1, each URL in the array would in turn be used in the database search and then passed as a parameter to getWebsite, whose result would be displayed in its callback function. Only then would the next element in the array be processed.

What actually happens is that a few (maybe 10) of the URLs are fetched correctly, then the server starts to respond sporadically with 500 Internal Server Error. After a few seconds, my computer freezes and then restarts (which I guess is due to some kind of panic?).

How can I attack this problem?

If the problem is really about concurrency, you can divide the work into chunks and chain the chunks.

Let's start with a function that does a mongo lookup and a get:

// answer a promise that resolves to data from mongo and a get from the web
// for a given url, resolve to { mongoData, webData }
// (assuming this is what OP wants. the OP appears to discard the mongo result)
//
function lookupAndGet(url) {
  // use the promise-returning variant of findOne
  let result = {}
  return db.collection("some-collection").findOne({URL: url}).then(mongoData => {
    result.mongoData = mongoData
    return rp(url) 
  }).then(webData => {
    result.webData = webData
    return result
  })
}

lodash and underscore both offer a chunk method that breaks an array into an array of smaller arrays. Write your own or use theirs.

const _ = require('lodash')
let chunks = _.chunk(URLArray, 5)  // say 5 is a reasonable concurrency
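
If pulling in a library only for this is unwanted, a hand-written version is short. The sketch below is a hypothetical `chunk` helper; the name and argument order mirror lodash's for convenience, but it is not lodash's implementation.

```javascript
// Hypothetical helper: split `items` into consecutive slices of at most `size` elements.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer')
  }
  const out = []
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size))
  }
  return out
}

console.log(chunk(['a', 'b', 'c', 'd', 'e'], 2))
// prints [ [ 'a', 'b' ], [ 'c', 'd' ], [ 'e' ] ]
```

The last slice is simply shorter when the array length is not a multiple of the chunk size.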

Here's the point of the answer: make a chain of chunks so that only one chunk's worth of requests runs concurrently.

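let chain = chunks.reduce((acc, chunk) => {
  // pass .then a function so each chunk starts only after the previous one finishes
  return acc.then(results => {
    return Promise.all(chunk.map(url => lookupAndGet(url)))
      .then(chunkResults => results.concat([chunkResults]))
  })
}, Promise.resolve([]))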
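
For a chain like this to limit concurrency, the work has to start inside a function passed to `.then`, because a promise begins running the moment it is created. The following is a minimal sketch that can be run without a network or database: `fakeLookup` is a made-up stand-in for `lookupAndGet`, and `runInChunks` and the two counters are names introduced only for this illustration.

```javascript
let inFlight = 0      // tasks currently running
let maxInFlight = 0   // highest value inFlight ever reached

// Made-up stand-in for lookupAndGet: resolves after a short delay.
function fakeLookup(url) {
  inFlight += 1
  maxInFlight = Math.max(maxInFlight, inFlight)
  return new Promise(resolve => {
    setTimeout(() => {
      inFlight -= 1
      resolve({ url, webData: 'body of ' + url })
    }, 5)
  })
}

// Run `task` over `chunks`, one chunk at a time; resolves to an array of arrays.
function runInChunks(chunks, task) {
  return chunks.reduce((acc, chunk) => {
    return acc.then(results =>
      Promise.all(chunk.map(url => task(url)))
        .then(chunkResults => results.concat([chunkResults]))
    )
  }, Promise.resolve([]))
}

const demoChunks = [['u1', 'u2'], ['u3', 'u4'], ['u5']]
const demo = runInChunks(demoChunks, fakeLookup)
demo.then(results => {
  // number of chunks processed, and the peak number of tasks running at once
  console.log(results.length, maxInFlight)
})
```

With chunks of two, the peak counter should never exceed two, since the next chunk's tasks are created only after the previous `Promise.all` has resolved.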

Now execute the chain. Each chunk promise resolves to a chunk-sized array of results, so the reduced result will be an array of arrays. Fortunately, lodash and underscore both have a method to "flatten" the nested array.

// turn [ url, url, ...] into [ { mongoData, webData }, { mongoData, webData }, ...]
// running only 5 requests at a time
chain.then(result => {
  console.log(_.flatten(result))
})
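
One caveat with `Promise.all`: it rejects as soon as any single request fails, so one 500 response would abort the rest of the chain. A possible variation, sketched here under the assumption that failed URLs should be recorded rather than stop the run, wraps each task so that it always resolves. `settle` and `flakyTask` are hypothetical names, and `flakyTask` only simulates a failing request.

```javascript
// Hypothetical wrapper: turn a promise-returning task into one that never rejects.
function settle(task) {
  return input =>
    Promise.resolve()
      .then(() => task(input))
      .then(
        value => ({ input, ok: true, value }),
        error => ({ input, ok: false, error: error.message })
      )
}

// Simulated request: fails for one particular URL.
function flakyTask(url) {
  return url === 'bad'
    ? Promise.reject(new Error('500 Internal Server Error'))
    : Promise.resolve('body of ' + url)
}

const settled = Promise.all(['a', 'bad', 'c'].map(settle(flakyTask)))
settled.then(results => {
  // list the inputs whose task failed
  console.log(results.filter(r => !r.ok).map(r => r.input))
})
```

In the chunked chain, the same idea would amount to mapping each chunk with `settle(lookupAndGet)` instead of `lookupAndGet`, then filtering on `ok` after flattening.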
