
Delay each loop iteration in node js, async

I have the code below:

var request = require('request');
var cheerio = require ("cheerio");
var async= require("async");

var MyLink="www.mylink.com";

    async.series([

        function(callback){
            request(MyLink, function (error, response, body) {
                if (error) return callback(error); 
                var $ = cheerio.load(body);
                //Some calculations where I get NewUrl variable...
                TheUrl=NewUrl;
                callback();
            });
        },
        function(callback){
            for (var i = 0; i <=TheUrl.length-1; i++) {
                var url = 'www.myurl.com='+TheUrl[i];
                request(url, function(error, resp, body) { 
                    if (error) return callback(error); 
                    var $ = cheerio.load(body);
                    //Some calculations again...
                    callback();
                });
            };
        }
      ], function(error){
        if (error) return next(error);
    });

Does anyone have a suggestion about how I can delay each iteration of the for loop? Say, the code waits 10 seconds after each iteration is complete. I tried setTimeout but couldn't get it to work.

You can set a timeout for the execution of the code at increasing intervals like this:

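var interval = 10 * 1000; // 10 seconds
var pending = TheUrl.length; // requests that have not finished yet
var failed = false; // set once an error has been reported

for (var i = 0; i <= TheUrl.length - 1; i++) {
    setTimeout(function (i) {
        if (failed) return; // skip the remaining requests after an error
        var url = 'www.myurl.com=' + TheUrl[i];
        request(url, function (error, resp, body) {
            if (failed) return;
            if (error) { failed = true; return callback(error); }
            var $ = cheerio.load(body);
            //Some calculations again...
            // call back only once, after the last request has finished
            if (--pending === 0) callback();
        });
    }, interval * i, i);
}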

So the first one runs right away (interval * 0 is 0), the second one runs after ten seconds, and so on.

You need to send i as the final parameter in the setTimeout() so that its value is bound to the function argument. Otherwise the attempt to access the array value will be out of bounds and you will get undefined.
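To see the difference, here is a minimal sketch that runs on its own. The items array is a hypothetical stand-in for TheUrl, and the very short delays are chosen only to keep the demo quick.

```javascript
var items = ['a', 'b', 'c']; // hypothetical stand-in for TheUrl
var seenWithoutArg = [];
var seenWithArg = [];

for (var i = 0; i < items.length; i++) {
    // Closure over the shared `i`: when this runs, the loop has already
    // finished and i === items.length, so items[i] is undefined.
    setTimeout(function () { seenWithoutArg.push(items[i]); }, 0);

    // Extra setTimeout argument: the current value of i is handed to the
    // callback as `index` when the timer fires.
    setTimeout(function (index) { seenWithArg.push(items[index]); }, 0, i);
}

setTimeout(function () {
    console.log(seenWithoutArg); // [ undefined, undefined, undefined ]
    console.log(seenWithArg);    // [ 'a', 'b', 'c' ]
}, 10);
```

Declaring the loop variable with let instead of var would also give each iteration its own binding, but passing the value through setTimeout works with var as well.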

Another alternative would be to use async.eachSeries. For example:

async.eachSeries(TheUrl, function (eachUrl, done) {
    setTimeout(function () {
        var url = 'www.myurl.com='+eachUrl;
        request(url, function(error, resp, body) { 
            // report the error to eachSeries, which stops iterating
            if (error) return done(error); 
            var $ = cheerio.load(body);
            //Some calculations again...
            done();
        });
    }, 10000);
}, function (err) {
    // forward the result (error or success) to the outer callback
    callback(err);
});
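If it helps to picture the control flow, here is a rough, dependency-free sketch of "start the next item only after done() was called". The eachSeriesSketch function is a made-up stand-in written for this illustration, not the library's implementation, and the 20 ms delay only keeps the demo short.

```javascript
// Hypothetical stand-in for async.eachSeries, for illustration only.
function eachSeriesSketch(items, iteratee, finalCallback) {
    var index = 0;
    function next(err) {
        // Stop on the first error, or when every item has been processed.
        if (err || index >= items.length) return finalCallback(err || null);
        var item = items[index++];
        iteratee(item, next); // the next item starts only when `next` is called
    }
    next();
}

var delayMs = 20; // short delay for the demo; the answer above uses 10000
var order = [];

eachSeriesSketch(['a', 'b', 'c'], function (item, done) {
    setTimeout(function () {
        order.push(item); // the request would be made here
        done();
    }, delayMs);
}, function (err) {
    console.log(err, order); // null [ 'a', 'b', 'c' ]
});
```

Because each timer is created only after the previous item has called done(), the items are spaced out one after another instead of being scheduled all at once.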

Delaying multiple page fetches with async/await

I am a big fan of the async library and I've used it for a long time. However, now there's async/await. Your code becomes easier to read. For instance, this would be your main function:

const urls = await fetchUrls(INITIAL_URL);

for (const url of urls) {
    await sleep(10000);
    const $ = await fetchPage(url);
    // do stuff with cheerio-processed page
}

Much better, isn't it? Before I get into the details of how fetchPage() and fetchUrls() work, let's first answer your question of how to wait before fetching the next page. The sleep function is pretty straightforward:

async function sleep(millis) {
    return new Promise(resolve => setTimeout(resolve, millis));
}

You can get a full explanation of how it works in my other answer: https://stackoverflow.com/a/46720712/778272
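As a quick illustration, awaiting sleep() suspends only the async function that calls it, while the rest of the program keeps running. The demo function and the 50 ms delay are made up for this example.

```javascript
async function sleep(millis) {
    return new Promise(resolve => setTimeout(resolve, millis));
}

// Hypothetical helper: waits `delayMs` and reports how long it actually took.
async function demo(delayMs) {
    const started = Date.now();
    await sleep(delayMs);        // pauses this function only
    return Date.now() - started; // elapsed milliseconds
}

demo(50).then(elapsed => console.log(`resumed after about ${elapsed} ms`));
console.log("this line prints first, the event loop is not blocked");
```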

Ok, back to the other functions. The request library has a promise-enabled version that you can use with async/await. Let's check how fetchPage() is implemented:

async function fetchPage(url) {
    return await request({
        url: url,
        transform: (body) => cheerio.load(body)
    });
}

Since request is returning a promise, we can await it. I also took the chance to use the transform property, which allows us to transform the response body before resolving the promise. I'm passing it through Cheerio, just like you did in your code.

Finally, fetchUrls() can just call fetchPage() and process the result to build your array of URLs before resolving its promise. Here's the full code:

const
    request = require("request-promise-native"),
    cheerio = require("cheerio");

const
    INITIAL_URL = "http://your-initial-url.com";

/**
 * Asynchronously fetches the page referred to by `url`.
 *
 * @param {String} url - the URL of the page to be fetched
 * @return {Promise} promise to a cheerio-processed page
 */
async function fetchPage(url) {
    return await request({
        url: url,
        transform: (body) => cheerio.load(body)
    });
}

/**
 * Your initial fetch, which will bring the list of URLs you're looking for.
 *
 * @param {String} initialUrl - the initial URL
 * @return {Promise<string[]>} an array of URL strings
 */
async function fetchUrls(initialUrl) {
    const $ = await fetchPage(initialUrl);
    // process $ here and get urls
    return ["http://foo.com", "http://bar.com"];
}

/**
 * Clever way to do asynchronous sleep. 
 * Check this: https://stackoverflow.com/a/46720712/778272
 *
 * @param {Number} millis - how long to sleep in milliseconds
 * @return {Promise<void>}
 */
async function sleep(millis) {
    return new Promise(resolve => setTimeout(resolve, millis));
}

async function run() {
    const urls = await fetchUrls(INITIAL_URL);
    for (const url of urls) {
        await sleep(10000);
        const $ = await fetchPage(url);
        // do stuff with cheerio-processed page
    }
}

run();

To use request with promises, install it like this:

npm install request
npm install request-promise-native

And then require("request-promise-native") in your code, like in the example above.
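Note that the request package (and request-promise-native with it) has since been deprecated. If you'd rather not depend on it, the same loop can be written against any promise-returning fetcher. The following is only a sketch under assumptions: fetchAllWithDelay, fetchText and fetchWithBuiltin are names invented here, and fetchWithBuiltin assumes Node 18 or later, where fetch is available globally.

```javascript
function sleep(millis) {
    return new Promise(resolve => setTimeout(resolve, millis));
}

// Fetches each URL in turn, waiting `delayMs` before every request.
// `fetchText` is a hypothetical parameter: any function that takes a URL
// and returns a promise for the response body.
async function fetchAllWithDelay(urls, delayMs, fetchText) {
    const bodies = [];
    for (const url of urls) {
        await sleep(delayMs);
        bodies.push(await fetchText(url));
    }
    return bodies;
}

// Assumption: Node 18+, where `fetch` exists as a global.
const fetchWithBuiltin = async (url) => {
    const response = await fetch(url);
    if (!response.ok) throw new Error(`HTTP ${response.status} for ${url}`);
    return response.text();
};

// Usage (performs real network requests, so it is left commented out):
// fetchAllWithDelay(["http://foo.com", "http://bar.com"], 10000, fetchWithBuiltin)
//     .then(bodies => console.log(bodies.length));
```

Each body can then be passed to cheerio.load() just as in the code above. Keeping the fetcher as a parameter also makes the loop easy to try out with a stub instead of real requests.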

Since you're already using async, async.whilst would do nicely as a replacement for for.

whilst is an asynchronous while-like function. Each iteration is only run after the previous iteration has called its completion callback. In this case, we can simply postpone execution of the completion callback by 10 seconds with setTimeout.

var i = 0;
async.whilst(
    // test to perform next iteration
    function() { return i <= TheUrl.length-1; },

    // iterated function
    // call `innerCallback` when the iteration is done
    function(innerCallback) {
        var url = 'www.myurl.com='+TheUrl[i];
        request(url, function(error, resp, body) { 
            if (error) return innerCallback(error); 
            var $ = cheerio.load(body);
            //Some calculations again...

            // wait 10 secs to run the next iteration
            setTimeout(function() { i++; innerCallback(); }, 10000);
        });
    },

    // when all iterations are done, call `callback`
    callback
);
