performance.now with onnxruntime in sequential and parallel execution modes

Question

I am using onnxruntime in NodeJS to run ONNX-converted models on the CPU backend. I run the model inference in parallel using Promise.allSettled:

var promises = sequences.map(seq => self.inference(self.session, self.tokenizer, seq));
results = (await Promise.allSettled(promises)).filter(p => p.status === "fulfilled").map(p => p.value);

This runs the class instance method inference, which calls the static method Util.performance.now:

  ONNX.prototype.inference = async function (session, tokenizer, text) {
        const default_labels = this._options.model.default_labels;
        const labels = this._options.model.labels;
        const debug = this._options.debug;
        try {
            const encoded_ids = await tokenizer.tokenize(text);
            if (encoded_ids.length === 0) {
                return [0.0, default_labels];
            }
            const model_input = ONNX.create_model_input(encoded_ids);
            const start = Util.performance.now();
            const output = await session.run(model_input, ['output_0']);
            const duration = Util.performance.now(start).toFixed(1);

            const sequence_length = model_input['input_ids'].size;
            if (debug) console.log("latency = " + duration + "ms, sequence_length=" + sequence_length);
            const probs = output['output_0'].data.map(ONNX.sigmoid).map(t => Math.floor(t * 100));

            const result = [];
            for (var i = 0; i < labels.length; i++) {
                const t = [labels[i], probs[i]];
                result[i] = t;
            }
            result.sort(ONNX.sortResult);

            const result_list = [];
            for (i = 0; i < 6; i++) {
                result_list[i] = result[i];
            }
            return [parseFloat(duration), result_list];
        } catch (e) {
            return [0.0, default_labels];
        }
    }//inference

The timings come out wrong: they appear to add up. The performance object looks like this:

Util = {
   performance: {
      now: function (start) {
          if (!start) {
              return process.hrtime();
          }
          var end = process.hrtime(start);
          return Math.round((end[0] * 1000) + (end[1] / 1000000));
      }
  }
}

It is used in the usual way:

// this runs parallel
const start = Util.performance.now();
// computation
const duration = Util.performance.now(start).toFixed(1);

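Now, inside the performance function, the start and end variables are locally scoped, so what happens when Promise.allSettled is used? Because of the local scope, I would expect the timings to be correct.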

Answer 1

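The timing mechanism is correct, but when session.run is called, it starts some asynchronous (non-JavaScript) API work and immediately returns a pending promise object. This allows the other inference executions to call session.run as well, leading to a state where the asynchronous API is handling several such requests at the same time, so the same time slice is counted in multiple execution contexts of inference. These requests may even finish at very nearly the same moment. When that happens, the inference executions resume one after another and run the code that ends the timing (setting their duration variables). Clearly, these durations can, and most likely will, overlap one another.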

To visualize this for 3 executions of inference, you could draw it like this:

   start-----------------------------start+duration
      start-----------------------------start+duration
         start------------------------------start+duration

 time ----->
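
The overlap can be reproduced without onnxruntime. The following is a minimal sketch, not the asker's code: fakeRun is a hypothetical stand-in for session.run that simply resolves after a timer, so it models only the asynchronous wait and not real CPU contention. The helper names (elapsedMs, timedRun, measureConcurrent) are likewise made up for illustration.

```javascript
// Hypothetical stand-in for session.run: resolves after `ms` milliseconds.
// It models only the asynchronous wait, not real CPU contention.
function fakeRun(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

// Milliseconds elapsed since `start`, a value from process.hrtime.bigint().
function elapsedMs(start) {
  return Number(process.hrtime.bigint() - start) / 1e6;
}

// Times one call the same way inference does: start, await, stop.
async function timedRun(ms) {
  const start = process.hrtime.bigint();
  await fakeRun(ms);
  return elapsedMs(start);
}

// Starts `count` timed calls at once and compares the sum of the
// individual durations with the wall-clock time of the whole batch.
async function measureConcurrent(count, ms) {
  const wallStart = process.hrtime.bigint();
  const durations = await Promise.all(
    Array.from({ length: count }, () => timedRun(ms))
  );
  const wall = elapsedMs(wallStart);
  const sum = durations.reduce((a, b) => a + b, 0);
  return { durations, sum, wall };
}

measureConcurrent(3, 50).then(({ sum, wall }) => {
  console.log("sum of durations = " + sum.toFixed(1) + "ms, wall clock = " + wall.toFixed(1) + "ms");
});
```

Because the three measured intervals cover nearly the same stretch of time, the sum of the durations should come out at roughly three times the wall-clock time, which is the "adding up" effect described in the question.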

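If you don't want this parallelism, you should not make all the inference calls at nearly the same time, but instead hold off each next execution until the previous one has resolved: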

for (let seq of sequences) {
     let value = await self.inference(self.session, self.tokenizer, seq);
     // ...
}

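This way the asynchronous parts of session.run will not overlap, and the measurements will not be affected by concurrency. I would expect the timings to look more like this: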

   start---------------start+duration
                       start-------------start+duration
                                         start------------start+duration

 time ----->

Now no time slice is counted more than once, even though the total duration of the whole process may be longer.
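
The sequential case can be checked with a similar sketch. Again, fakeRun is a hypothetical timer-based stand-in for session.run, and measureSequential is an illustrative name, not part of onnxruntime or of the original code.

```javascript
// Hypothetical stand-in for session.run: resolves after `ms` milliseconds.
function fakeRun(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

// Milliseconds elapsed since `start`, a value from process.hrtime.bigint().
function elapsedMs(start) {
  return Number(process.hrtime.bigint() - start) / 1e6;
}

// Runs `count` timed calls one after another: each call is awaited
// before the next one starts, so the measured intervals are disjoint.
async function measureSequential(count, ms) {
  const wallStart = process.hrtime.bigint();
  const durations = [];
  for (let i = 0; i < count; i++) {
    const start = process.hrtime.bigint();
    await fakeRun(ms);
    durations.push(elapsedMs(start));
  }
  const wall = elapsedMs(wallStart);
  const sum = durations.reduce((a, b) => a + b, 0);
  return { durations, sum, wall };
}

measureSequential(3, 50).then(({ sum, wall }) => {
  console.log("sum of durations = " + sum.toFixed(1) + "ms, wall clock = " + wall.toFixed(1) + "ms");
});
```

Here each measured interval lies inside the overall interval without overlapping the others, so the sum of the durations cannot exceed the wall-clock time and should be close to it.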

Note that this behavior has nothing to do with Promise.allSettled: you would get the same outputs even if you removed the Promise.allSettled call from your program.
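
A small sketch of why that is: an async function starts running as soon as it is called, so all requests are already in flight before Promise.allSettled is reached. The fakeInference function below is a hypothetical stand-in for inference that only records when it starts and ends.

```javascript
// Records the order of events to show when each call actually starts.
async function demo() {
  const events = [];

  // Hypothetical stand-in for inference: logs its start and its end.
  async function fakeInference(id) {
    events.push("start " + id); // runs synchronously, at call time
    await new Promise(resolve => setTimeout(resolve, 10));
    events.push("end " + id);
    return id;
  }

  // The calls are made here; every request is in flight afterwards.
  const promises = [1, 2, 3].map(id => fakeInference(id));
  events.push("all calls made");

  // Promise.allSettled only waits for promises that already exist.
  const settled = await Promise.allSettled(promises);
  return { events, values: settled.map(p => p.value) };
}

demo().then(({ events }) => console.log(events.join("\n")));
```

All three "start" entries are recorded before "all calls made", and none of the "end" entries appears before it. Dropping Promise.allSettled would change only how the results are collected, not when the work begins.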

Answer 1 (accepted, 2022-05-04 15:10:36)
