How to send many HTTP requests sequentially in Node.js?
I am building a crawler with Node.js.
The target web page has 10+ categories, and my crawler can fetch them.
I make a request for each category (10+ requests).
Each category page then has 100+ items, and I make a request for each item (100+ requests).
So I need 10+ * 100+ requests in total!
Here is my code:
const axios = require("axios")
const cheerio = require("cheerio");

async function request(url) {
    return await axios.get(url);
}

function main() {
    request(url).then(html => {
        const $ = cheerio.load(html.data);
        const categoryArray = $('table.table tbody').children('tr').toArray()
        categoryArray.map(category => {
            console.log("category: " + category.name)
            request(category.url).then( html => {
                const $ = cheerio.load(html.data);
                const items = $('table.table tbody').children('tr').toArray()
                console.log("item.length: " + items.length)
                items.map(item => {
                    request(item).then(html => {
                        const $ = cheerio.load(html.data);
                        const itemDetails = $('table.table tbody').children('tr').toArray()
                        console.log("item.name: " + itemDetails.name)
                    })
                })
            })
        })
    })
}
But it doesn't work.
The console output looks like this:
category: A
category: B
category: C
category: D
category: E
category: F
category: G
category: H
category: I
category: J
category: K
category: L
category: M
category: N
item.length: 0
item.length: 100
item.length: 100
item.length: 0
item.length: 100
item.length: 0
item.length: 0
item.length: 100
item.length: 0
item.length: 0
item.length: 0
item.length: 0
item.length: 0
item.length: 0
item.name: item1
(node:5409) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 2)
(node:5409) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
At first it seems to work, but after a few seconds it stops working.
I think categoryArray.map doesn't wait for the child requests to finish, so the number of open HTTP connections hits the maximum.
I don't know how to fix it...
Your problem is that Array.prototype.map is not Promise-aware, so it can't wait for your requests.
Instead of using map, simply use async/await and iterate over the arrays using for ... of:
// Note: this assumes request() resolves to the already-parsed array
// (for example via cheerio), not to the raw axios response.
async function main() {
    const categoryArray = await request(categoryUrl)
    for (const category of categoryArray) {
        console.log("category: " + category.name)
        const items = await request(category.url)
        console.log("item.length: " + items.length)
        for (const item of items) {
            const itemDetails = await request(item)
            console.log("item.name: " + itemDetails.name)
        }
    }
}
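To make the difference concrete, here is a small sketch that compares the two approaches without touching the network. fakeRequest is a hypothetical stand-in for the real axios call (it just resolves after a short timer), and crawlWithMap, crawlSequentially and the log array are names made up for this illustration; none of them come from the question. It also wraps the awaited call in try/catch, which is one way to avoid the UnhandledPromiseRejectionWarning shown in the question's output.

```javascript
// Hypothetical stand-in for an HTTP request: resolves after a short delay.
// In a real crawler this would wrap axios.get(url) plus the cheerio parsing.
function fakeRequest(url, log) {
  return new Promise(resolve => {
    setTimeout(() => {
      log.push("done " + url);
      resolve(url);
    }, 10);
  });
}

// map starts every request immediately and returns an array of pending
// promises; nothing has finished by the time map itself returns.
function crawlWithMap(urls, log) {
  const pending = urls.map(url => fakeRequest(url, log));
  log.push("map returned");
  return Promise.all(pending);
}

// for...of with await starts each request only after the previous one
// has settled, so only one request is in flight at a time.
async function crawlSequentially(urls, log) {
  for (const url of urls) {
    try {
      await fakeRequest(url, log);
    } catch (err) {
      // Handling the rejection here keeps one failed request
      // from becoming an unhandled promise rejection.
      log.push("failed " + url + ": " + err.message);
    }
  }
  log.push("loop finished");
}
```

With map, "map returned" is logged before any request completes, which is why all 10+ * 100+ requests in the question are fired almost at once. With for...of, each "done" entry appears in order before the loop moves on.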