
How to send many HTTP requests sequentially in Node.js?

I am building a crawler with Node.js.

The target web page has 10+ categories.

My crawler can fetch them.

I then make one request per category (10+ requests).

Each category page has 100+ items.

I make one request per item as well (100+ requests per category).

So I need 10+ * 100+ requests in total!

Here is my code.

const axios = require("axios")
const cheerio = require("cheerio");

async function request(url) {
    return await axios.get(url);
}

function main() {
    request(url).then(html => {
        const $ = cheerio.load(html.data);
        const categoryArray = $('table.table tbody').children('tr').toArray()

        categoryArray.map(category => {
            console.log("category: " + category.name)

            request(category.url).then( html => {
                const $ = cheerio.load(html.data);
                const items = $('table.table tbody').children('tr').toArray()

                console.log("item.length: " + items.length)

                items.map(item => {
                    request(item).then(html => {
                        const $ = cheerio.load(html.data);
                        const itemDetails = $('table.table tbody').children('tr').toArray()

                        console.log("item.name: " + itemDetails.name)
                    })
                })
            })
        })
    })
}

But it doesn't work.

The console output looks like this:

category: A
category: B
category: C
category: D
category: E
category: F
category: G
category: H
category: I
category: J
category: K
category: L
category: M
category: N
item.length: 0
item.length: 100
item.length: 100
item.length: 0
item.length: 100
item.length: 0
item.length: 0
item.length: 100
item.length: 0
item.length: 0
item.length: 0
item.length: 0
item.length: 0
item.length: 0
item.name: item1
(node:5409) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 2)
(node:5409) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

At first it seems to work fine, but after a few seconds it stops working.

I think "categoryArray.map" doesn't wait for the child requests to finish.

So the number of concurrent HTTP connections reaches the maximum.

I don't know how to fix it.

Your problem is that Array.prototype.map is not Promise-aware: it calls the callback for every element immediately and does not wait for the promises those callbacks return, so all of your requests start at once.

Instead of using map, simply use async/await and iterate over the arrays with for...of:
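To make the point about map concrete, here is a minimal sketch that needs no network access. fakeRequest and startAll are hypothetical names introduced only for this illustration; fakeRequest stands in for an HTTP call by resolving after a short timer.

```javascript
// Hypothetical stand-in for an HTTP request: resolves with `value` after `ms` milliseconds.
function fakeRequest(value, ms) {
    return new Promise(resolve => setTimeout(() => resolve(value), ms));
}

// map() invokes the callback for every element synchronously and simply
// collects the returned promises; it never waits for one to settle.
function startAll(values) {
    return values.map(value => fakeRequest(value, 10));
}

async function demo() {
    const pending = startAll(["A", "B", "C"]);
    // At this point all three "requests" are already in flight.
    console.log(pending.every(p => p instanceof Promise));
    // Promise.all is one way to wait for promises produced by map().
    const results = await Promise.all(pending);
    console.log(results);
    return results;
}

demo().catch(console.error);
```

With 10+ categories and 100+ items each, the same pattern would start every request at roughly the same time, which matches the burst of output seen in the question.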

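// Simplified: request() is assumed here to resolve to the already-parsed
// array (the cheerio parsing from the question is omitted), and
// categoryUrl is a placeholder for the start page.
async function main() {
    const categoryArray = await request(categoryUrl)
    for (const category of categoryArray) {
        console.log("category: " + category.name)

        // The loop pauses here until this category page has arrived.
        const items = await request(category.url)
        console.log("item.length: " + items.length)

        for (const item of items) {
            // Only one item request is in flight at any time.
            const itemDetails = await request(item)
            console.log("item.name: " + itemDetails.name)
        }
    }
}

// Handle failures so a rejected request is not left as an unhandled rejection.
main().catch(console.error)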
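The sequential pattern can be tried without a real site. The sketch below is only an illustration under stated assumptions: fakeSite, fetchLinks and crawl are hypothetical names, and fetchLinks replaces axios.get plus cheerio parsing with a timer and an in-memory map. It also wraps each category in try/catch, which is one possible way to avoid the unhandled rejection warning from the question.

```javascript
// Hypothetical in-memory "site": each URL maps to the links found on that page.
const fakeSite = new Map([
    ["/", ["/cat/a", "/cat/missing", "/cat/b"]],
    ["/cat/a", ["/item/1", "/item/2"]],
    ["/cat/b", ["/item/3"]],
    ["/item/1", []],
    ["/item/2", []],
    ["/item/3", []],
]);

// Hypothetical stand-in for axios.get() plus cheerio parsing.
// It records when each request starts and ends so the order can be inspected.
function fetchLinks(url, log) {
    log.push("start " + url);
    return new Promise((resolve, reject) => {
        setTimeout(() => {
            log.push("end " + url);
            if (fakeSite.has(url)) resolve(fakeSite.get(url));
            else reject(new Error("not found: " + url));
        }, 5);
    });
}

// Requests one page at a time; a failed category is reported and skipped.
async function crawl(rootUrl, log = []) {
    const visitedItems = [];
    for (const categoryUrl of await fetchLinks(rootUrl, log)) {
        try {
            for (const itemUrl of await fetchLinks(categoryUrl, log)) {
                await fetchLinks(itemUrl, log);
                visitedItems.push(itemUrl);
            }
        } catch (err) {
            console.error("skipping rest of " + categoryUrl + ": " + err.message);
        }
    }
    return visitedItems;
}

crawl("/").then(items => console.log(items)).catch(console.error);
```

Passing an array as the second argument to crawl lets you inspect the order afterwards: every "start" entry is followed directly by the matching "end" entry, meaning no two requests overlap. The trade-off is speed, since the total time is the sum of all request times.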

