Boost ASIO async_write_some is really slow

Question

I finally found the bottleneck in my server: it turns out to be async_write, and likewise async_write_some.

Here is the benchmark code:

struct timespec start, end;
clock_gettime(CLOCK_MONOTONIC, &start);

//boost::asio::async_write(mMainData.mSocket, boost::asio::buffer(pSendBuff->pBuffer, pSendBuff->dwUsedSize), mMainData.mStrand.wrap(boost::bind(&CServer::WriteHandler, pServer, this, pSendBuff, boost::asio::placeholders::error, boost::asio::placeholders::bytes_transferred)));
mMainData.mSocket.async_write_some(boost::asio::buffer(pSendBuff->pBuffer, pSendBuff->dwUsedSize), (boost::bind(&CServer::WriteHandler, pServer, this, pSendBuff, boost::asio::placeholders::error, boost::asio::placeholders::bytes_transferred)));

clock_gettime(CLOCK_MONOTONIC, &end);

timespec temp;
if ((end.tv_nsec - start.tv_nsec) < 0)
{
    temp.tv_sec = end.tv_sec - start.tv_sec - 1;
    temp.tv_nsec = 1000000000 + end.tv_nsec - start.tv_nsec;
}
else
{
    temp.tv_sec = end.tv_sec - start.tv_sec;
    temp.tv_nsec = end.tv_nsec - start.tv_nsec;
}

pLogger->WriteToFile("./Logs/Benchmark_SendPacketP_AsyncWrite.txt", "dwDiff: %.4f\r\n", (float)temp.tv_nsec / 1000000.0f);

And the output:

-[2016.05.21 03:45:19] dwDiff: 0.0552ms
-[2016.05.21 03:45:19] dwDiff: 0.0404ms
-[2016.05.21 03:45:19] dwDiff: 0.0542ms
-[2016.05.21 03:45:20] dwDiff: 0.0576ms

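This is far too slow. It is a game server, and I need to broadcast packets in room channels. A room channel has 300 players in a single channel, so imagine the network latency this causes for my players.

Of course, this test was run with only me on the server.

Is my code wrong, or am I missing something in the logic of the ASIO implementation?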

CXXFLAGS: -ggdb -ffunction-sections -Ofast -m64 -pthread -fpermissive -w -lboost_system -lboost_thread -Wall -fomit-frame-pointer
LDFLAGS: -Wl,-gc-sections -m64 -pthread -fpermissive -w -lboost_system -lboost_thread -lcurl

Hardware: Intel Xeon E3-1231v3 (4 cores, 8 threads), 64 GB RAM, 1 Gbps uplink.

I am spawning 8 ASIO workers.

So I stepped into async_write with the debugger and found the following:

template <typename ConstBufferSequence, typename Handler>
void async_send(base_implementation_type& impl,
  const ConstBufferSequence& buffers,
  socket_base::message_flags flags, Handler& handler)
{
bool is_continuation =
  boost_asio_handler_cont_helpers::is_continuation(handler);

// Allocate and construct an operation to wrap the handler.
typedef reactive_socket_send_op<ConstBufferSequence, Handler> op;
typename op::ptr p = { boost::asio::detail::addressof(handler),
  boost_asio_handler_alloc_helpers::allocate(
    sizeof(op), handler), 0 };
p.p = new (p.v) op(impl.socket_, buffers, flags, handler);

BOOST_ASIO_HANDLER_CREATION((p.p, "socket", &impl, "async_send"));

start_op(impl, reactor::write_op, p.p, is_continuation, true,
    ((impl.state_ & socket_ops::stream_oriented)
      && buffer_sequence_adapter<boost::asio::const_buffer,
        ConstBufferSequence>::all_empty(buffers)));
p.v = p.p = 0;
}

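Why does boost::asio call "new" in a library that is supposed to be high-performance? Is there any way to pre-create whatever it is allocating? Sorry, I cannot profile the internals, because I am developing with Microsoft Visual Studio through VisualGDB, using the GCC 4.8.5 toolset running in VMWare.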

Answer 1

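I know this answer is a bit late, but I am posting it in case someone finds it useful.

Indeed, new is called when a completion handler is posted. However, the official documentation explains how to optimize this and avoid the associated runtime overhead by implementing custom memory management. Here is the example: custom memory management for completion handlers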
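To make the idea concrete: in the Asio source quoted in the question, the memory for the operation comes from boost_asio_handler_alloc_helpers::allocate, and the `new (p.v)` expression is a placement new into that memory. By default that hook falls back to the global heap, and custom memory management replaces it. Below is a minimal sketch of the kind of reusable block such a hook could hand out. The class name HandlerMemory, the 1024-byte capacity, and the in_use() accessor are assumptions for illustration; it is only loosely modeled on the documentation's allocation example and is not the library's own code.

```cpp
#include <cstddef>
#include <new>

// Hypothetical single-slot arena for one connection's handler allocations.
// Not thread-safe: it assumes the handlers of a connection never run
// concurrently (for example, because they are serialized through a strand).
class HandlerMemory {
public:
    HandlerMemory() : in_use_(false) {}
    HandlerMemory(const HandlerMemory&) = delete;
    HandlerMemory& operator=(const HandlerMemory&) = delete;

    // Hand out the preallocated block if it is free and large enough;
    // otherwise fall back to the global heap.
    void* allocate(std::size_t size) {
        if (!in_use_ && size <= sizeof(storage_)) {
            in_use_ = true;
            return storage_;
        }
        return ::operator new(size);
    }

    // Mark the block as free again, or return heap memory to the heap.
    void deallocate(void* pointer) {
        if (pointer == storage_) {
            in_use_ = false;
        } else {
            ::operator delete(pointer);
        }
    }

    bool in_use() const { return in_use_; }

private:
    alignas(std::max_align_t) unsigned char storage_[1024];  // assumed capacity
    bool in_use_;
};
```

How the block is connected to Asio depends on the Boost version. Older releases look up asio_handler_allocate and asio_handler_deallocate overloads for the handler type, while newer releases use the handler's associated allocator, so the documentation example for the installed version is the thing to follow.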

Answer 2

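Without a profiler, trying to determine which instruction is the bottleneck can be a futile test of patience. Creating a minimal example may help identify the source of the problem in a particular environment. For example, in a controlled scenario with contention for neither I/O nor the io_service, I observed writes of roughly 0.015 ms with both native write() and Asio's async_write().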
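For such a minimal example, a small timing helper keeps the measurement separate from the code being measured. The following is only a sketch: the name time_ms is hypothetical, and it uses std::chrono::steady_clock instead of the clock_gettime arithmetic from the question.

```cpp
#include <chrono>
#include <utility>

// Run `work` once and return the elapsed time in milliseconds,
// measured with a monotonic clock.
template <typename Work>
double time_ms(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

The call under test would be passed as a lambda, for instance one that wraps a single write() or async_write() call. A single sample is noisy, so in practice many runs would be averaged.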

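The problem being solved is writing the same message to 300 peers with minimal latency. One solution may be to parallelize the problem: rather than having a single job write the message serially to 300 peers, consider using n jobs that run in parallel, each writing the message serially to 300/n peers. A rough estimate: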
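The partitioning step could be sketched as follows. The function name make_batches and the use of contiguous index ranges are assumptions for illustration. In a real server each range would be handed to its own job (for example, posted to the io_service), which then walks its peers and issues the writes.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Split the peer indices [0, peer_count) into at most `jobs` contiguous
// half-open ranges [begin, end), each holding at most ceil(peer_count / jobs)
// peers.
std::vector<std::pair<std::size_t, std::size_t>>
make_batches(std::size_t peer_count, std::size_t jobs) {
    std::vector<std::pair<std::size_t, std::size_t>> batches;
    if (peer_count == 0 || jobs == 0) {
        return batches;
    }
    const std::size_t chunk = (peer_count + jobs - 1) / jobs;  // round up
    for (std::size_t begin = 0; begin < peer_count; begin += chunk) {
        batches.emplace_back(begin, std::min(begin + chunk, peer_count));
    }
    return batches;
}
```

With 300 peers and 8 jobs this yields 8 ranges, none longer than 38 peers.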

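If the 300 writes are performed serially, each taking 0.015 ms (the average observed in the controlled environment with native write()), the final write starts 4.485 ms after the first write.
If the 300 writes are batched according to the potential concurrency limit (8 in this case), there will be 8 jobs running in parallel, each performing 38 writes serially. If each write takes 0.0576 ms (as observed on the actual system), the final write starts 2.13 ms after the first write.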

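Based on the above estimates, parallelizing the problem lets the writes to 300 peers finish in about half the time, even though each async_write operation takes longer than expected. Keep in mind that these are rough estimates, so profiling is needed to determine the ideal amount of concurrency and to identify underlying bottlenecks.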

Answer 1
1 2016-09-18 19:06:05

Answer 2
0 2016-05-22 14:18:36