
How can I execute some async tasks in parallel, with a limit, in a generator function?

I'm trying to execute some async tasks in parallel, with a limit on the maximum number of tasks running simultaneously.

Here's an example of what I want to achieve:

(image: task flow example)

Currently these tasks run one after another. It's implemented this way:

export function signData(dataItem) {
  cadesplugin.async_spawn(async function* (args) {
    //... nestedArgs assignment logic ...

    for (const id of dataItem.identifiers) {
      yield* idHandler(dataItem, id, args, nestedArgs);
    }
    
    // some extra logic after all tasks were finished
  }, firstArg, secondArg);
}

async function* idHandler(edsItem, researchId, args, nestedArgs) {
  ...
  let oDocumentNameAttr = yield cadesplugin.CreateObjectAsync("CADESCOM.CPAttribute");
  yield oDocumentNameAttr.propset_Value("Document Name");
  ...
  // this function mutates some external data, making API calls and returns void
}

Unfortunately, I can't make any changes to the cadesplugin.* functions, but I can use any external library (or the built-in Promise) in my code.

I found some methods (eachLimit and parallelLimit) in the async library that might work for me, and an answer that shows how to deal with it.

But there are still two problems I can't solve:

  1. How can I pass the main parameters into the nested function?
  2. The main function is a generator function, so I still need to work with yield expressions in both the main and the nested functions.

There's a link to the cadesplugin.* source code, where you can find async_spawn (and the other cadesplugin.* functions) used in my code.

Here's the code I tried, with no luck:

await forEachLimit(dataItem.identifiers, 5, yield* async function* (researchId, callback) { 
  //... nested function code 
});

It leads to an "Object is not async iterable" error.

Another attempt:

let functionArray = [];
dataItem.identifiers.forEach(researchId => {
  functionArray.push(researchIdHandler(dataItem, id, args, nestedArgs))
});
await parallelLimit(functionArray, 5);

It just does nothing.

Can I solve this problem somehow, or do generator functions make it impossible?

square peg, round hole

You cannot use async iterables for this problem. It is the nature of for await..of to run in series. await blocks, and the loop will not continue until the awaited promise has resolved. You need a more precise level of control where you can enforce these specific requirements.
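To make the serial behaviour concrete, here is a minimal sketch. All names in it (sleep, job, jobs, inSeries, allAtOnce, timed) are hypothetical helpers made up for illustration, and the 100 ms delay is arbitrary. It contrasts consuming an async iterable with starting every job up front:

```javascript
// Hypothetical helpers for illustration only.
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))
const job = x => sleep(100).then(() => x * 10)

// An async generator only starts the next job when the consumer asks for it.
async function* jobs(inputs) {
  for (const x of inputs) yield job(x)
}

// for await..of asks for one value at a time, so the jobs never overlap.
async function inSeries(inputs) {
  const results = []
  for await (const r of jobs(inputs)) results.push(r)
  return results
}

// Starting every job first and waiting afterwards lets them overlap,
// but with no upper bound on how many run at once.
function allAtOnce(inputs) {
  return Promise.all(inputs.map(job))
}

// Measures how long an async function takes, in milliseconds.
async function timed(f) {
  const start = Date.now()
  const value = await f()
  return { value, ms: Date.now() - start }
}
```

With three inputs, timed(() => inSeries([1, 2, 3])) should take roughly 300 ms and timed(() => allAtOnce([1, 2, 3])) roughly 100 ms. Neither gives a limit of N concurrent jobs, which is the gap the rest of this answer fills.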

To start, we have a mock myJob that simulates a long computation. More than likely this will be a network request to some API in your app -

// any asynchronous task
const myJob = x =>
  sleep(rand(5000)).then(_ => x * 10)

Using Pool defined in this Q&A, we instantiate Pool(size=4), where size is the maximum number of jobs ("threads", in Pool's terminology) allowed to run concurrently -

const pool = new Pool(4)

For ergonomics, I added a run method to the Pool class, making it easier to wrap and run jobs -

class Pool {
  constructor (size) ...
  open () ...
  deferNow () ...
  deferStacked () ...

  // added method
  async run (t) {
    const close = await this.open()
    return t().then(close)
  }
}

Now we need to write an effect that uses our pool to run myJob. Here you will also decide what to do with the result. Note the promise must be wrapped in a thunk, otherwise pool cannot control when it begins -

async function myEffect(x) {
  // run the job with the pool
  const r = await pool.run(_ => myJob(x))

  // do something with the result
  const s = document.createTextNode(`${r}\n`)
  document.body.appendChild(s)

  // return a value, if you want
  return r
}
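The thunk requirement is easy to miss, so here is a small sketch of the difference. tracedJob, log, eager and thunk are hypothetical names used only for this illustration:

```javascript
const log = []
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))

// Hypothetical job that records the moment it starts.
const tracedJob = x => {
  log.push(`start ${x}`)
  return sleep(10).then(() => x * 10)
}

// Eager: calling tracedJob(1) starts the work right here,
// before any pool has had a chance to decide anything.
const eager = tracedJob(1)

// Lazy: the thunk only describes the work; nothing runs until it is called.
const thunk = () => tracedJob(2)

log.push('scheduler decides')
const lazy = thunk()

// log is now ['start 1', 'scheduler decides', 'start 2']
```

Passing pool.run(myJob(x)) instead of pool.run(_ => myJob(x)) would correspond to the eager case: the job would already be running by the time the pool sees it.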

Now run everything by mapping myEffect over your list of inputs. In our example myEffect we return r, which means the collected results are also available once every job has finished. This is optional, but it demonstrates how the program knows when everything is done -

Promise.all([1,2,3,4,5,6,7,8,9,10,11,12].map(myEffect))
  .then(JSON.stringify)
  .then(console.log, console.error)

full program demo

In the functioning demo below, I condensed the definitions so we can see them all at once. Run the program to verify the result in your own browser -

class Pool {
  constructor (size = 4) { Object.assign(this, { pool: new Set, stack: [], size }) }
  open () { return this.pool.size < this.size ? this.deferNow() : this.deferStacked() }
  async run (t) { const close = await this.open(); return t().then(close) }
  deferNow () {
    const [t, close] = thread()
    const p = t
      .then(_ => this.pool.delete(p))
      .then(_ => this.stack.length && this.stack.pop().close())
    this.pool.add(p)
    return close
  }
  deferStacked () {
    const [t, close] = thread()
    this.stack.push({ close })
    return t.then(_ => this.deferNow())
  }
}

const rand = x => Math.random() * x
const effect = f => x => (f(x), x)
const thread = close => [new Promise(r => { close = effect(r) }), close]
const sleep = ms => new Promise(r => setTimeout(r, ms))

const myJob = x => sleep(rand(5000)).then(_ => x * 10)

async function myEffect(x) {
  const r = await pool.run(_ => myJob(x))
  const s = document.createTextNode(`${r}\n`)
  document.body.appendChild(s)
  return r
}

const pool = new Pool(4)

Promise.all([1,2,3,4,5,6,7,8,9,10,11,12].map(myEffect))
  .then(JSON.stringify)
  .then(console.log, console.error)
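The demo above needs a browser because of document. For checking the limit outside a browser, here is a rough sketch of a much smaller limiter. createLimiter, acquire, release and demo are hypothetical names, and this is a simpler stand-in for Pool, not the same code. One deliberate difference: as far as I can tell, run in Pool only calls close when the job fulfils, so a rejected job would keep its slot; the sketch releases the slot in finally instead.

```javascript
// Hypothetical minimal limiter; a stand-in for Pool, not a copy of it.
function createLimiter(size) {
  let active = 0
  const waiting = [] // resolvers of callers waiting for a free slot

  const acquire = () => {
    if (active < size) { active++; return Promise.resolve() }
    return new Promise(resolve => waiting.push(resolve))
  }

  const release = () => {
    const next = waiting.shift() // oldest waiter first
    if (next) next()             // hand the slot over; active stays the same
    else active--
  }

  // thunk: a function returning a promise, as with pool.run
  return async function run(thunk) {
    await acquire()
    try { return await thunk() }
    finally { release() }        // the slot is freed on failure too
  }
}

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))

// Runs 12 jobs through a limiter of 4 and records the highest overlap seen.
async function demo() {
  const run = createLimiter(4)
  let running = 0
  let peak = 0
  const job = x => run(async () => {
    running++
    peak = Math.max(peak, running)
    await sleep(20)
    running--
    return x * 10
  })
  const results = await Promise.all([1,2,3,4,5,6,7,8,9,10,11,12].map(job))
  return { results, peak }
}
```

demo() resolves with the twelve results in input order and a peak of 4, which is a quick way to confirm that the limit actually holds.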

slow it down

Pool above runs concurrent jobs as quickly as possible. You may also be interested in throttle, which is also introduced in the original post. Instead of making Pool more complex, we can wrap our jobs using throttle to give the caller control over the minimum time a job should take -

const throttle = (p, ms) =>
  Promise.all([ p, sleep(ms) ]).then(([ value, _ ]) => value)

We can add a throttle in myEffect. Now even if myJob finishes very quickly, it holds its slot in the pool for at least 5 seconds before the next job is run -

async function myEffect(x) {
  const r = await pool.run(_ => throttle(myJob(x), 5000))
  const s = document.createTextNode(`${r}\n`)
  document.body.appendChild(s)
  return r
}
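A quick way to see what throttle does is to time a fast job and a slow job against the same minimum. The sketch below repeats throttle and sleep from above so it runs on its own; measure is a hypothetical helper and the durations are arbitrary:

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))

// Same throttle as above: wait for both the job and the minimum delay.
const throttle = (p, ms) =>
  Promise.all([p, sleep(ms)]).then(([value, _]) => value)

// Hypothetical helper: runs a job lasting jobMs under a minimum of minMs
// and reports the value together with the elapsed time.
async function measure(jobMs, minMs) {
  const start = Date.now()
  const value = await throttle(sleep(jobMs).then(() => 'done'), minMs)
  return { value, elapsed: Date.now() - start }
}
```

measure(10, 100) should take roughly 100 ms (the minimum wins) and measure(150, 100) roughly 150 ms (the job wins). One thing to keep in mind: Promise.all rejects as soon as the job rejects, so the minimum time is only enforced for jobs that succeed.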

In general, it is probably better to apply @Mulan's answer.

But if you are also stuck with the cadesplugin.* generator functions and don't mind a heavyweight external library, this answer may also be helpful.

(If you are worried about heavyweight external libraries, you can still combine this answer with @Mulan's.)

Running the async tasks with a limit can be solved using the Promise.map function from the bluebird library and a second, nested use of the cadesplugin.async_spawn function.

The code will look like the following:

export function signData(dataItem) {
  cadesplugin.async_spawn(async function* (args) {
    // some extra logic before all of the tasks

    await Promise.map(dataItem.identifiers,
      (id) => cadesplugin.async_spawn(async function* (args) {
        // ...
        let oDocumentNameAttr = yield cadesplugin.CreateObjectAsync("CADESCOM.CPAttribute");
        yield oDocumentNameAttr.propset_Value("Document Name");
        // ...
        // this function mutates some external data and makes API calls
      }),
      {
        concurrency: 5 //Parallel tasks count
      });
    
    // some extra logic after all tasks were finished
  }, firstArg, secondArg);
}
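If bluebird feels too heavy just for Promise.map, the same shape can be sketched without dependencies. This is only an illustration under my own assumptions: mapLimit, handleId and signAll are hypothetical names, handleId stands in for the nested async_spawn call, and mapLimit is not bluebird's implementation. It also shows the answer to the first problem from the question: the outer parameters reach the nested function through the closure.

```javascript
// Dependency-free sketch of a concurrency-limited map (hypothetical helper).
async function mapLimit(items, limit, mapper) {
  const results = new Array(items.length)
  let next = 0

  // Each worker keeps claiming the next unprocessed index until none are left.
  async function worker() {
    while (next < items.length) {
      const i = next++ // claimed synchronously, so no two workers share an index
      results[i] = await mapper(items[i], i)
    }
  }

  const count = Math.min(limit, items.length)
  await Promise.all(Array.from({ length: count }, () => worker()))
  return results
}

// Hypothetical stand-in for the per-identifier work.
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))
const handleId = async (dataItem, id, args) => {
  await sleep(10)
  return `${dataItem.name}/${id}/${args.join('+')}`
}

// dataItem and args are captured by the arrow function passed as mapper.
function signAll(dataItem, args) {
  return mapLimit(dataItem.identifiers, 5, id => handleId(dataItem, id, args))
}
```

Results come back in input order. If one mapper call rejects, mapLimit rejects, but the other workers keep draining the list, so add your own cancellation if that matters.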

The magic comes from the async_spawn function, which is defined as:

function async_spawn(generatorFunction) {
  async function continuer(verb, arg) {
    let result;
    try {
      result = await generator[verb](arg);
    } catch (err) {
      return Promise.reject(err);
    }
    if (result.done) {
      return result.value;
    } else {
      return Promise.resolve(result.value).then(onFulfilled, onRejected);
    }
  }

  let generator = generatorFunction(Array.prototype.slice.call(arguments, 1));
  let onFulfilled = continuer.bind(continuer, "next");
  let onRejected = continuer.bind(continuer, "throw");
  return onFulfilled();
}

It can suspend the execution of the inner generator functions at their yield expressions without suspending the outer generator function.
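To watch this happen outside of cadesplugin, the sketch below repeats the async_spawn definition from above and drives two small generators with it. sleep, trace and task are hypothetical names for this illustration. Note that, reading the definition, the extra arguments arrive in the generator as a single array.

```javascript
// async_spawn as defined above, repeated so the sketch runs on its own.
function async_spawn(generatorFunction) {
  async function continuer(verb, arg) {
    let result;
    try {
      result = await generator[verb](arg);
    } catch (err) {
      return Promise.reject(err);
    }
    if (result.done) {
      return result.value;
    } else {
      return Promise.resolve(result.value).then(onFulfilled, onRejected);
    }
  }

  let generator = generatorFunction(Array.prototype.slice.call(arguments, 1));
  let onFulfilled = continuer.bind(continuer, "next");
  let onRejected = continuer.bind(continuer, "throw");
  return onFulfilled();
}

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))
const trace = []

// Hypothetical task: each yield hands a promise to async_spawn,
// which sends the resolved value back into the generator.
const task = name => async_spawn(async function* (args) {
  trace.push(`${name} step 1`)
  const a = yield sleep(20).then(() => 1)
  trace.push(`${name} step 2`)
  const b = yield sleep(20).then(() => 2)
  return `${name}:${a + b}:${args.join(',')}`
}, 'x', 'y')

// Two spawned generators make progress independently of each other.
const both = Promise.all([task('A'), task('B')])
```

both resolves to ['A:3:x,y', 'B:3:x,y'], and both "step 1" entries land in trace before either "step 2" entry: while A is parked on its first yield, B has already started.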
