
async_receive_from stops receiving after a few packets under Linux

I have a setup with multiple peers broadcasting UDP packets (containing images) every 200 ms (5 fps).
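For context, a minimal sketch of what such a sender might look like (the broadcast address, port 12345, payload size, and sleep-based loop are illustrative assumptions, not the actual streaming code):

#include <boost/asio.hpp>
#include <boost/thread.hpp>
#include <vector>

// Hypothetical minimal sender: broadcasts a dummy payload every 200 ms.
int main()
{
  boost::asio::io_service io_service;
  boost::asio::ip::udp::socket socket(io_service,
    boost::asio::ip::udp::endpoint(boost::asio::ip::udp::v4(), 0));
  socket.set_option(boost::asio::socket_base::broadcast(true));

  boost::asio::ip::udp::endpoint broadcast_endpoint(
    boost::asio::ip::address_v4::broadcast(), 12345);

  std::vector<char> payload(32000); // stand-in for one image frame
  for(;;)
  {
    socket.send_to(boost::asio::buffer(payload), broadcast_endpoint);
    boost::this_thread::sleep(boost::posix_time::milliseconds(200));
  }
  return 0;
}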

While receiving both the local stream and external streams works fine under Windows, the same code (except for the socket->cancel(); on Windows XP, see the comment in the code) produces rather strange behavior under Linux:

  • The first few (5~7) packets sent by another machine (when this machine starts streaming) are received as expected;
  • After this, packets from the other machine are received only after irregular, long intervals (12 s, 5 s, 17 s, ...) or hit the timeout (set to 20 seconds). At certain moments there is again a burst of (3~4) packets received as expected.
  • The packets sent by the machine itself are still being received as expected.

Using Wireshark, I see both local and external packets arriving as they should, with correct time intervals between consecutive packets. The behavior also occurs when the local machine is only listening to a single other stream, with the local stream disabled.

This is some code from the receiver (with some updates as suggested below, thanks!):

#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <iostream>

using namespace boost::asio;

Receiver::Receiver(port p)
{
  this->port = p;
  this->stop = false;
}

int Receiver::run()
{
  io_service io_service;
  boost::asio::ip::udp::socket socket(
    io_service,
    boost::asio::ip::udp::endpoint(boost::asio::ip::udp::v4(),
    this->port));
  while(!stop)
  {
    const int bufflength = 65000;
    int timeout = 20000;
    char sockdata[bufflength];
    boost::asio::ip::udp::endpoint remote_endpoint;
    int rcvd;

    bool read_success = this->receive_with_timeout(
           sockdata, bufflength, &rcvd, &socket, remote_endpoint, timeout);

    if(read_success)
    {
      std::cout << "read succes " << remote_endpoint.address().to_string() << std::endl;
    }
    else
    {
      std::cout << "read fail" << std::endl;
    }
  }
  return 0;
}

// Completion handler for async_receive_from: flags success and records
// the number of bytes received.
void handle_receive_from(
  bool* toset, boost::system::error_code error, size_t length, int* outsize)
{
  if(!error || error == boost::asio::error::message_size)
  {
    *toset = length > 0;
    *outsize = length;
  }
  else
  {
    std::cout << error.message() << std::endl;
  }
}

// Update: error check -- only treat it as a timeout if the wait
// completed without error (i.e. the timer was not cancelled)
void handle_timeout( bool* toset, boost::system::error_code error)
{
  if(!error)
  {
    *toset = true;
  }
  else
  {
    std::cout << error.message() << std::endl;
  }
}

bool Receiver::receive_with_timeout(
  char* data, int buffl, int* outsize,
  boost::asio::ip::udp::socket *socket,
  boost::asio::ip::udp::endpoint &sender_endpoint, int msec_tout)
{
  bool timer_overflow = false;
  bool read_result = false;

  deadline_timer timer( socket->get_io_service() );

  timer.expires_from_now( boost::posix_time::milliseconds(msec_tout) );
  timer.async_wait( boost::bind(&handle_timeout, &timer_overflow,
    boost::asio::placeholders::error) );

  socket->async_receive_from(
    boost::asio::buffer(data, buffl), sender_endpoint,
    boost::bind(&handle_receive_from, &read_result,
    boost::asio::placeholders::error,
    boost::asio::placeholders::bytes_transferred, outsize));

  socket->get_io_service().reset();

  while ( socket->get_io_service().run_one())
  {
    if ( read_result )
    {
      timer.cancel();
    }
    else if ( timer_overflow )
    {
      //not to be used on Windows XP, Windows Server 2003, or earlier
      socket->cancel();
      // Update: added run_one() so the cancelled receive handler can run
      socket->get_io_service().run_one();
    }
  }
  // Update: added run_one() so the cancelled timer's handler can complete
  socket->get_io_service().run_one();
  return read_result;
}

When the timer exceeds 20 seconds, the error message "Operation canceled" is returned, but it is difficult to get any other information about what is going on.
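One way to get a little more detail out of the failure (a hedged suggestion, not part of the original code) is to log the numeric value and category of the error_code alongside its message, inside the error branches of the handlers:

// e.g. in handle_receive_from's else branch:
std::cout << "error: " << error.message()
          << " (value=" << error.value()
          << ", category=" << error.category().name() << ")"
          << std::endl;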

Can anyone identify a problem or give me some hints on how to get more information about what is going wrong? Any help is appreciated.

Okay, what you're doing is that when you call receive_with_timeout, you're setting up two asynchronous requests (one for the recv, one for the timeout). When the first one completes, you cancel the other.

However, you never invoke io_service::run_one() again to allow its callback to complete. When you cancel an operation in boost::asio, it invokes the handler, usually with an error code indicating that the operation has been aborted or canceled. In this case, I believe you have a dangling handler once you destroy the deadline timer, since it holds a pointer onto the stack where it stores its result.

The solution is to call run_one() again to process the canceled callback's result prior to exiting the function. You should also check the error code being passed to your timeout handler, and only treat it as a timeout if there was no error.

Also, in the case where you do have a timeout, you need to execute run_one so that the async_receive_from handler can execute and report that it was canceled.
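To see the mechanism in isolation, here is a small self-contained sketch (independent of the question's code, using only timers) showing that a cancelled asynchronous operation still invokes its handler with operation_aborted, and that this handler only runs once run_one() is called again:

#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <iostream>

void on_wait(const char* name, const boost::system::error_code& error)
{
  if(error == boost::asio::error::operation_aborted)
    std::cout << name << ": cancelled (operation_aborted)" << std::endl;
  else
    std::cout << name << ": fired normally" << std::endl;
}

int main()
{
  boost::asio::io_service io_service;
  boost::asio::deadline_timer short_timer(io_service,
    boost::posix_time::milliseconds(10));
  boost::asio::deadline_timer long_timer(io_service,
    boost::posix_time::seconds(20));

  short_timer.async_wait(boost::bind(&on_wait, "short_timer", _1));
  long_timer.async_wait(boost::bind(&on_wait, "long_timer", _1));

  io_service.run_one();  // runs the short timer's handler
  long_timer.cancel();   // queues the long timer's handler with
                         // operation_aborted -- it has NOT run yet
  io_service.run_one();  // the extra run_one() lets it complete
  return 0;
}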

After a clean installation of Xubuntu 12.04 instead of an old install of Ubuntu 10.04, everything now works as expected. Maybe it is because the new install runs a newer kernel, probably with improved networking? In any case, a reinstall with a newer version of the distribution solved my problem.

If anyone else sees unexpected network behavior with an older kernel, I would advise trying it on a system with a newer kernel installed.
