
liburing: io_uring_submit() causes error when placed in await_suspend

I am currently experimenting with C++ coroutines to abstract away io_uring. To that end, I have the following class:

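#include <liburing.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <coroutine>
#include <cstring>
#include <iostream>
#include <stdexcept>
#include <string>
#include <utility>

// io_result and threadpool are the asker's own types, defined elsewhere.
class io_service final {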
public:
   explicit io_service(unsigned size, threadpool& pool) : pool_(pool) {
      if (auto ret = io_uring_queue_init(size, &ring_, 0); ret < 0) {
         throw std::runtime_error{"Liburing error!"};
      }
   }

   ~io_service() {
      io_uring_queue_exit(&ring_);
   }

   void message_pump() {
      io_uring_cqe* cqe = nullptr;

      while (true) {
         auto ret = io_uring_wait_cqe(&ring_, &cqe);

         // Check the wait result first: cqe is only valid when ret == 0.
         if (ret < 0) {
            std::cerr << "Fatal error in io_uring_wait_cqe!\n";
            throw std::runtime_error{"Fatal error in io_uring_wait_cqe!"};
         }

         auto* data = static_cast<io_result*>(io_uring_cqe_get_data(cqe));

         if (cqe->res < 0) {
            std::cerr << "Error while doing an asynchronous request: " 
                      << -cqe->res << " (" << strerror(-cqe->res) << ")\n";
            throw std::runtime_error{"Error while doing an asynchronous request : " 
                        + std::string(strerror(-cqe->res))};
         }

         data->status_code = cqe->res;

         pool_.push_task([handle = data->handle] { handle.resume(); });
         io_uring_cqe_seen(&ring_, cqe);
      }
   }

   [[nodiscard]] auto accept_async(int socket, sockaddr_in& in, socklen_t& socket_length) {
      return uring_awaitable{
         &ring_, 
         io_result::operation_type::accept, 
         io_uring_prep_accept,
         socket, 
         reinterpret_cast<sockaddr*>(&in), 
         &socket_length, 
         0
      };
   }

  
private:
   struct uring_awaiter {
      io_uring* ring_;
      io_uring_sqe* entry;
      io_result request_data{};

      explicit uring_awaiter(io_result::operation_type op_type, io_uring* ring, io_uring_sqe* sqe) : ring_(ring), entry(sqe), request_data{op_type} {}

      [[nodiscard]] bool await_ready() const noexcept { return false; }

      void await_suspend(std::coroutine_handle<> handle) noexcept {
         request_data.handle = handle;
         io_uring_sqe_set_data(entry, &request_data);



         // SUBMITTING HERE LATER CAUSES ERRORS ==============================
         io_uring_submit(ring_);
         // ==================================================================



      }

      [[nodiscard]] int await_resume() const noexcept {
         return request_data.status_code;
      }
   };

   class uring_awaitable {
   public:
      template <typename F, typename... Args>
         requires requires(F f) { f(std::declval<io_uring_sqe*>(), std::declval<Args>()...); }
      uring_awaitable(io_uring* ring, io_result::operation_type op, F function, Args&&... args)
         : ring_(ring), sqe_(io_uring_get_sqe(ring_)), op_(op) {
         function(sqe_, std::forward<Args>(args)...);
      }

      auto operator co_await() const {
         return uring_awaiter{op_, ring_, sqe_};
      }

   private:
      io_uring* ring_;
      io_uring_sqe* sqe_;
      io_result::operation_type op_;
   };

   io_uring ring_{};
   bool interrupted_ = false;
   threadpool& pool_;
};

The class is meant to be used like this:

threadpool p{};
io_service s{128, p};

// In another thread, later
co_await s.accept_async(/* ... */);

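As the code snippet above shows, the problem occurs when I put io_uring_submit into await_suspend(). I then get the output "Error while doing an asynchronous request: 125 (Operation canceled)". However, if I change message_pump() to something like this (and remove the submit from await_suspend()):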

void message_pump() {
   using namespace std::chrono_literals;

   io_uring_cqe* cqe = nullptr;

   while (true) {
      // SUBMITTING HERE ==================================================
      std::this_thread::sleep_for(1s);
      io_uring_submit(&ring_);
      // ==================================================================

      auto ret = io_uring_wait_cqe(&ring_, &cqe);

      // Check the wait result first: cqe is only valid when ret == 0.
      if (ret < 0) {
         std::cerr << "Fatal error in io_uring_wait_cqe!\n";
         throw std::runtime_error{"Fatal error in io_uring_wait_cqe!"};
      }

      auto* data = static_cast<io_result*>(io_uring_cqe_get_data(cqe));

      if (cqe->res < 0) {
         std::cerr << "Error while doing an asynchronous request: " << -cqe->res << " (" << strerror(-cqe->res) << ")\n";
         throw std::runtime_error{"Error while doing an asynchronous request : " + std::string(strerror(-cqe->res))};
      }

      data->status_code = cqe->res;

      pool_.push_task([handle = data->handle] { handle.resume(); });
      io_uring_cqe_seen(&ring_, cqe);
   }
}

Now everything works as expected. Obviously, this is not the right way to do things.

Why does the first approach not work?

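Operations that are completed through kernel task work have to use the thread that called io_uring_submit(). This means that thread must not terminate before the cqe has been completed in the kernel. If you submit sqes from a dynamic thread pool, you risk losing completions.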

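I am not 100% sure that accept uses kernel task work, or that this situation is what returns -ECANCELED, but this is the reason I had to switch to a dedicated thread for submitting uring_cmds.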
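For reference, io_uring reports a failed request as a negative errno value in cqe->res, and on Linux ECANCELED has the value 125, which matches the number in the question's output. A minimal sketch of decoding that field follows; describe_cqe_result and was_canceled are hypothetical helper names used for illustration, not liburing functions.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper (not part of liburing): turns the `res` field of a
// completion into a readable string. Failures arrive as -errno.
inline std::string describe_cqe_result(int res) {
   if (res >= 0) {
      return "ok (" + std::to_string(res) + ")";
   }
   const int err = -res;
   return "errno " + std::to_string(err) + " (" + std::strerror(err) + ")";
}

// Hypothetical predicate: true when the request was canceled, as opposed to
// failing with some other error.
inline bool was_canceled(int res) {
   return res == -ECANCELED;
}
```

In message_pump() a check such as was_canceled(cqe->res) could be used to tell a canceled request apart from other failures before deciding whether to throw.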

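The advice in the liburing feature request "submit requests from any thread" is to either have a single thread do the submitting or give each thread its own ring (scroll to the very bottom).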
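The first suggestion, a single long-lived submitting thread, could be sketched roughly as below. This is only an illustration under assumptions: single_submitter, post and flush are hypothetical names, and the sketch uses the standard library alone, with comments marking where the liburing calls would presumably go.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical helper: funnels all submission work onto one long-lived thread.
class single_submitter {
public:
   using job = std::function<void()>;

   // `flush` runs on the submitter thread after each batch of jobs; with
   // liburing this is where a call such as io_uring_submit(&ring) would go.
   explicit single_submitter(job flush)
      : flush_(std::move(flush)), worker_([this] { run(); }) {}

   ~single_submitter() {
      {
         std::lock_guard<std::mutex> lock(mutex_);
         stopping_ = true;
      }
      wake_.notify_one();
      worker_.join();   // jobs still queued are drained before the thread exits
   }

   single_submitter(const single_submitter&) = delete;
   single_submitter& operator=(const single_submitter&) = delete;

   // Callable from any thread. With liburing, `prepare` would obtain an SQE,
   // fill it in and attach user data; it should not submit by itself.
   void post(job prepare) {
      assert(prepare && "post() needs a callable job");
      {
         std::lock_guard<std::mutex> lock(mutex_);
         pending_.push_back(std::move(prepare));
      }
      wake_.notify_one();
   }

private:
   void run() {
      std::unique_lock<std::mutex> lock(mutex_);
      for (;;) {
         wake_.wait(lock, [this] { return stopping_ || !pending_.empty(); });
         if (pending_.empty()) {
            return;   // only reached when stopping_ is set and nothing is left
         }
         std::deque<job> batch;
         batch.swap(pending_);
         lock.unlock();
         for (auto& prepare : batch) {
            prepare();
         }
         if (flush_) {
            flush_();   // one submit per batch, always from this thread
         }
         lock.lock();
      }
   }

   job flush_;
   std::mutex mutex_;
   std::condition_variable wake_;
   std::deque<job> pending_;
   bool stopping_ = false;
   std::thread worker_;   // declared last so all other members exist before it starts
};
```

Applied to the io_service from the question, await_suspend() would call post() with a job that sets the SQE user data, and flush would call io_uring_submit(&ring_). Acquiring the SQE would probably have to move into the job as well, since a ring's submission queue is not meant to be filled from several threads without external synchronization.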
