Too many simultaneous requests with NodeJS+request-promise
I have a NodeJS project with a BIG array (about 9000 elements) containing URLs. Those URLs are going to be requested using the request-promise package. However, 9000 concurrent GET requests to the same website from the same client are liked by neither the server nor the client, so I want to spread them out over time. I have looked around a bit and found Promise.map together with the {concurrency: int} option, which sounded like it would do what I want. But I cannot get it to work. My code looks like this:
const rp = require('request-promise');
var MongoClient = require('mongodb').MongoClient;

var URLarray = []; //This contains 9000 URLs

function getWebsite(url) {
    rp(url)
    .then(html => { /* Do some stuff */ })
    .catch(err => { console.log(err) });
}

MongoClient.connect('mongodb://localhost:27017/some-database', function (err, client) {
    Promise.map(URLArray, (url) => {
        db.collection("some-collection").findOne({URL: url}, (err, data) => {
            if (err) throw err;
            getWebsite(url, (result) => {
                if(result != null) {
                    console.log(result);
                }
            });
        }, {concurrency: 1});
});
I think I probably misunderstand how to deal with promises. In this scenario I would have thought that, with the concurrency option set to 1, each URL in the array would in turn be used in the database search and then passed as a parameter to getWebsite, whose result would be displayed in its callback function. THEN the next element in the array would be processed.
What actually happens is that a few (maybe 10) of the URLs are fetched correctly, then the server starts to respond sporadically with 500 internal server error. After a few seconds, my computer freezes and then restarts (which I guess is due to some kind of panic?).
How can I attack this problem?
If the problem is really about concurrency, you can divide the work into chunks and chain the chunks.
Let's start with a function that does a mongo lookup and a get...
// answer a promise that resolves to data from mongo and a get from the web
// for a given url, return { mongoData, webData }
// (assuming this is what OP wants. the OP appears to discard the mongo result)
//
function lookupAndGet(url) {
  // use the promise-returning variant of findOne
  let result = {}
  return db.collection("some-collection").findOne({URL: url}).then(mongoData => {
    result.mongoData = mongoData
    return rp(url)
  }).then(webData => {
    result.webData = webData
    return result
  })
}
lodash and underscore both offer a chunk method that breaks an array into an array of smaller arrays. Write your own or use theirs.
const _ = require('lodash')
let chunks = _.chunk(URLArray, 5) // say 5 is a reasonable concurrency
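If pulling in lodash only for this feels heavy, the "write your own" route is just a few lines. This is a minimal sketch; the name chunkArray is made up here for illustration and is not part of lodash or underscore.

```javascript
// Hypothetical helper (not from lodash or underscore):
// split `items` into arrays holding at most `size` elements each.
function chunkArray(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer')
  }
  const chunks = []
  for (let i = 0; i < items.length; i += size) {
    // slice never runs past the end, so the last chunk may be shorter
    chunks.push(items.slice(i, i + size))
  }
  return chunks
}

console.log(chunkArray(['a', 'b', 'c', 'd', 'e'], 2))
// → [ [ 'a', 'b' ], [ 'c', 'd' ], [ 'e' ] ]
```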
Here's the point of the answer: make a chain of chunks so that only one chunk's worth of requests runs concurrently...
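let chain = chunks.reduce((acc, chunk) => {
  // start the chunk inside the callback so it waits for the previous chunk to finish
  return acc.then(results => {
    return Promise.all(chunk.map(url => lookupAndGet(url)))
      .then(chunkResults => results.concat([chunkResults]))
  })
}, Promise.resolve([]))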
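One detail is easy to get wrong with this pattern: a promise starts its work the moment it is created, and .then ignores any argument that is not a function. Each chunk therefore has to be started from inside a .then callback. The sketch below checks this with a stand-in for lookupAndGet (fakeLookup, a hypothetical function that only waits on a timer) and counts how many calls are in flight at once. runInChunks is likewise a name invented for this example.

```javascript
// Stand-in for lookupAndGet: hypothetical, it only waits a few milliseconds
// and records how many calls are running at the same moment.
let inFlight = 0
let maxInFlight = 0

function fakeLookup(url) {
  inFlight += 1
  maxInFlight = Math.max(maxInFlight, inFlight)
  return new Promise(resolve => {
    setTimeout(() => {
      inFlight -= 1
      resolve({ url })
    }, 5)
  })
}

// Run the chunks one after another. Each chunk is started inside the
// .then callback, so it cannot begin before the previous chunk settles.
function runInChunks(chunks, worker) {
  return chunks.reduce((acc, chunk) => {
    return acc.then(results =>
      Promise.all(chunk.map(url => worker(url)))
        .then(chunkResults => results.concat([chunkResults]))
    )
  }, Promise.resolve([]))
}

const demoChunks = [['u1', 'u2'], ['u3', 'u4'], ['u5']]
const demoRun = runInChunks(demoChunks, fakeLookup).then(results => {
  console.log('chunks finished:', results.length)
  console.log('max in flight:', maxInFlight)
  return results
})
```

With chunks of two, the counter should never rise above 2, which is the behaviour the chain is meant to guarantee.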
Now execute the chain. The chunk promises will return chunk-sized arrays of results, so your reduced result will be an array of arrays. Fortunately, lodash and underscore both have a method to "flatten" the nested array.
// turn [ url, url, ...] into [ { mongoData, webData }, { mongoData, webData }, ...]
// running only 5 requests at a time
chain.then(result => {
  console.log(_.flatten(result))
})
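A side effect of chunking is that every batch waits for its slowest request before the next batch starts. If that matters, a small worker pool can keep a fixed number of requests in flight instead. This is an alternative sketch, not part of the approach above; mapWithLimit, its parameters, and the slowDouble stand-in are all names invented here, and it assumes limit is at least 1.

```javascript
// Hypothetical helper: run `worker` over `items` with at most `limit` calls
// in flight, resolving to the results in the original order.
// Assumes limit >= 1.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length)
  let next = 0

  // Each runner claims the next unprocessed index until none are left.
  // Claiming is safe without locks because JavaScript runs this code
  // on a single thread and there is no await between read and increment.
  async function runner() {
    while (next < items.length) {
      const index = next
      next += 1
      results[index] = await worker(items[index])
    }
  }

  const runnerCount = Math.min(limit, items.length)
  const runners = []
  for (let i = 0; i < runnerCount; i += 1) {
    runners.push(runner())
  }
  await Promise.all(runners)
  return results
}

// Demo with a timer-based stand-in for a real request.
const slowDouble = n => new Promise(resolve => setTimeout(() => resolve(n * 2), 5))
const poolDemo = mapWithLimit([1, 2, 3, 4, 5], 2, slowDouble)
poolDemo.then(results => console.log(results)) // → [ 2, 4, 6, 8, 10 ]
```

Under these assumptions, something like mapWithLimit(URLArray, 5, lookupAndGet) would return a flat array directly, so no flatten step would be needed.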