Why can't our packet sniffer receive all of the replayed TCP packets?

Question

We are trying to replay a pcap file (smallFlows.pcap) over a 10 GbE connection using tcpreplay and capture every packet, recording the source and destination IP addresses and ports. However, there is significant packet loss: at 3 Gbps we lose around 15% of the packets sent, and even at 1 Gbps we lose 7%. Our sniffer program is written in C using netmap-libpcap and is a modified version of sniffex.c.

We removed all the print statements when testing. We tried changing the snap length and buffer size, but that only slightly improved the packet loss rate. We also set the CPU cores on both the sender and receiver to performance mode to maximize the clock speeds (around 2.67 GHz on the receiver), but that had no effect. According to top, the CPU usage was fairly low - around 15%.

The receiver has an Intel Core i7 processor. The sender is running Ubuntu 12.04.3 LTS (linux kernel 3.8.13) and the receiver is running Ubuntu 12.04 (linux kernel 3.2.0-23-generic).

What can we do to ensure that all the packets are received?

Here is the main function:

int main(int argc, char **argv)
{

  char *dev = NULL;         /* capture device name */
  char errbuf[PCAP_ERRBUF_SIZE];        /* error buffer */
  pcap_t *handle;               /* packet capture handle */

  char filter_exp[] = "ip";     /* filter expression [3] */
  bpf_u_int32 mask;         /* subnet mask */
  bpf_u_int32 net;          /* ip */
  int num_packets = 10;         /* number of packets to capture */

  print_app_banner();
  printf("%s\n", pcap_lib_version());   /* never pass a non-literal string as the format */
  /* check for capture device name on command-line */
  if (argc == 2) {
    dev = argv[1];
  }
  else if (argc > 2) {
    fprintf(stderr, "error: unrecognized command-line options\n\n");
    print_app_usage();
    exit(EXIT_FAILURE);
  }
  else {
    /* find a capture device if not specified on command-line */
    dev = pcap_lookupdev(errbuf);
    if (dev == NULL) {
        fprintf(stderr, "Couldn't find default device: %s\n",
            errbuf);
        exit(EXIT_FAILURE);
    }
  }

  /* get network number and mask associated with capture device */
  if (pcap_lookupnet(dev, &net, &mask, errbuf) == -1) {
    fprintf(stderr, "Couldn't get netmask for device %s: %s\n",
        dev, errbuf);
    net = 0;
    mask = 0;
  }

  /* print capture info */
  printf("Device: %s\n", dev);
  printf("Number of packets: %d\n", num_packets);
  printf("Filter expression: %s\n", filter_exp);


  /* open capture device */
  //handle = pcap_open_live(dev, SNAP_LEN, 1, 1000, errbuf);
  handle = pcap_create(dev, errbuf);
  if (handle == NULL) {
    fprintf(stderr, "Couldn't open device %s: %s\n", dev, errbuf);
    exit(EXIT_FAILURE);
  }

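  pcap_set_snaplen(handle, 1518);
  pcap_set_promisc(handle, 1);
  pcap_set_timeout(handle, 1000);
  pcap_set_buffer_size(handle, 20971520);   /* 20 MiB */
  /* pcap_activate() returns a negative value on error */
  if (pcap_activate(handle) < 0) {
    fprintf(stderr, "Couldn't activate %s: %s\n", dev, pcap_geterr(handle));
    exit(EXIT_FAILURE);
  }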


  /* make sure we're capturing on an Ethernet device [2] */
  if (pcap_datalink(handle) != DLT_EN10MB) {
    fprintf(stderr, "%s is not an Ethernet\n", dev);
    exit(EXIT_FAILURE);
  } 

  /* now we can set our callback function */
  pcap_loop(handle, 0/*num_packets*/, got_packet, NULL);

  /* cleanup */
  pcap_close(handle);

  printf("\nCapture complete.\n");

  return 0;
}

Here is the packet handler code called by pcap_loop():

/*
* dissect packet
*/
void got_packet(u_char *args, const struct pcap_pkthdr *header, const u_char *packet)
{

  static int count = 1;                   /* packet counter */

  /* declare pointers to packet headers */
  const struct sniff_ethernet *ethernet;  /* The ethernet header [1] */
  const struct sniff_ip *ip;              /* The IP header */
  const struct sniff_tcp *tcp;            /* The TCP header */
  const char *payload;                    /* Packet payload */

  int size_ip;
  int size_tcp;
  int size_payload;

  //printf("\nPacket number %d:\n", count);
  count++;
  //if(count >= 2852200)
  printf("count: %d\n", count);
  /* define ethernet header */
  ethernet = (struct sniff_ethernet*)(packet);

  /* define/compute ip header offset */
  ip = (struct sniff_ip*)(packet + SIZE_ETHERNET);
  size_ip = IP_HL(ip)*4;
  if (size_ip < 20) {
    //printf("   * Invalid IP header length: %u bytes\n", size_ip);
    return;
  }

  /* define/compute tcp header offset */
  tcp = (struct sniff_tcp*)(packet + SIZE_ETHERNET + size_ip);
  size_tcp = TH_OFF(tcp)*4;

  /* compute tcp payload (segment) size */
  size_payload = ntohs(ip->ip_len) - (size_ip + size_tcp);

  return;
}

Thank you for your help.

Answer 1

What was the CPU usage measured against? Was it 15% of a single core or 15% of all cores? If it was 15% of all cores and you have 8 cores, that is equivalent to 120% of a single core, so one core may well be fully saturated. That could explain why your single-threaded application fails to capture all packets.
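One way to settle the per-core question is to read the per-core counters in /proc/stat instead of relying on the summary line in top. Here is a minimal sketch in C, assuming a Linux receiver; the names cpu_sample, parse_cpu_line and busy_percent are made up for illustration:

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Jiffy counters taken from one "cpuN ..." line of /proc/stat. */
struct cpu_sample {
    unsigned long long busy;   /* user + nice + system + irq + softirq + steal */
    unsigned long long total;  /* busy + idle + iowait */
};

/* Parse one per-core line such as "cpu3 100 0 50 800 10 5 35 0".
 * Returns the core index, or -1 if this is not a per-core line
 * (for example the aggregate "cpu  ..." line at the top of the file). */
int parse_cpu_line(const char *line, struct cpu_sample *out)
{
    int core;
    unsigned long long user, nice, sys, idle;
    unsigned long long iowait = 0, irq = 0, softirq = 0, steal = 0;

    if (strncmp(line, "cpu", 3) != 0 || !isdigit((unsigned char)line[3]))
        return -1;

    int n = sscanf(line, "cpu%d %llu %llu %llu %llu %llu %llu %llu %llu",
                   &core, &user, &nice, &sys, &idle,
                   &iowait, &irq, &softirq, &steal);
    if (n < 5)
        return -1;

    out->busy = user + nice + sys + irq + softirq + steal;
    out->total = out->busy + idle + iowait;
    return core;
}

/* Busy percentage of one core between two samples of the same core. */
double busy_percent(const struct cpu_sample *before,
                    const struct cpu_sample *after)
{
    unsigned long long dt = after->total - before->total;
    if (dt == 0)
        return 0.0;
    return (double)(after->busy - before->busy) * 100.0 / (double)dt;
}
```

Read /proc/stat twice, about a second apart, pass each cpuN line from both reads through parse_cpu_line, and compare the samples with busy_percent. If one core sits near 100% while the others are idle, the capture thread (or the softirq work for the NIC) is the likely bottleneck even though the overall figure looks low.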

If you are unable to receive all packets using the pcap library, the remaining option is to try a different packet reception mechanism. Linux has PF_PACKET sockets, which could possibly help in your situation. According to the answer to the question "libpcap or PF_PACKET?", libpcap should be preferred over PF_PACKET because it is more portable and internally uses the memory-mapped mechanism of PF_PACKET, which is tricky to use directly.

Because libpcap uses the memory-mapped mechanism of PF_PACKET, you could try using PF_PACKET manually in non-memory-mapped mode, so that your packets arrive through a different code path. If a bug somewhere in the memory-mapped mode is causing the packet loss, this would sidestep it.
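To make the suggestion concrete, here is a rough sketch of a non-memory-mapped PF_PACKET receiver in C. It assumes Linux, IPv4 over plain Ethernet without VLAN tags, and enough privilege to open a raw socket (root or CAP_NET_RAW). The helper names (endpoints, open_packet_socket, parse_tcp_endpoints, count_tcp_frames) are invented for this example and are not taken from sniffex.c:

```c
#include <arpa/inet.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

struct endpoints {
    uint32_t src_ip, dst_ip;      /* host byte order */
    uint16_t src_port, dst_port;  /* host byte order */
};

/* Open a raw PF_PACKET socket bound to ifname, in promiscuous mode.
 * Returns the file descriptor, or -1 on failure. */
int open_packet_socket(const char *ifname)
{
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_IP));
    if (fd < 0)
        return -1;

    struct sockaddr_ll sll;
    memset(&sll, 0, sizeof sll);
    sll.sll_family = AF_PACKET;
    sll.sll_protocol = htons(ETH_P_IP);
    sll.sll_ifindex = (int)if_nametoindex(ifname);

    struct packet_mreq mr;
    memset(&mr, 0, sizeof mr);
    mr.mr_ifindex = sll.sll_ifindex;
    mr.mr_type = PACKET_MR_PROMISC;

    if (sll.sll_ifindex == 0 ||
        bind(fd, (struct sockaddr *)&sll, sizeof sll) < 0 ||
        setsockopt(fd, SOL_PACKET, PACKET_ADD_MEMBERSHIP, &mr, sizeof mr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Extract addresses and ports from an Ethernet frame carrying IPv4/TCP.
 * Returns 0 on success, -1 if the frame is too short or not IPv4/TCP. */
int parse_tcp_endpoints(const unsigned char *frame, size_t len,
                        struct endpoints *out)
{
    if (len < ETH_HLEN + 20)
        return -1;
    uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    if (ethertype != ETH_P_IP)
        return -1;

    const unsigned char *ip = frame + ETH_HLEN;
    size_t ihl = (size_t)(ip[0] & 0x0f) * 4;      /* IP header length */
    if ((ip[0] >> 4) != 4 || ihl < 20 || ip[9] != IPPROTO_TCP)
        return -1;
    if (len < ETH_HLEN + ihl + 4)                 /* need both port fields */
        return -1;

    const unsigned char *tcp = ip + ihl;
    out->src_ip = ((uint32_t)ip[12] << 24) | ((uint32_t)ip[13] << 16) |
                  ((uint32_t)ip[14] << 8) | ip[15];
    out->dst_ip = ((uint32_t)ip[16] << 24) | ((uint32_t)ip[17] << 16) |
                  ((uint32_t)ip[18] << 8) | ip[19];
    out->src_port = (uint16_t)((tcp[0] << 8) | tcp[1]);
    out->dst_port = (uint16_t)((tcp[2] << 8) | tcp[3]);
    return 0;
}

/* Blocking receive loop: one recv() per frame, no memory-mapped ring.
 * Returns the number of TCP frames seen, or -1 if recv() fails. */
long count_tcp_frames(int fd, long max_frames)
{
    unsigned char buf[2048];   /* holds a standard 1518-byte frame */
    struct endpoints ep;
    long seen = 0;

    while (seen < max_frames) {
        ssize_t n = recv(fd, buf, sizeof buf, 0);
        if (n < 0)
            return -1;
        if (parse_tcp_endpoints(buf, (size_t)n, &ep) == 0)
            seen++;
    }
    return seen;
}
```

A caller would do something like fd = open_packet_socket("eth0") followed by count_tcp_frames(fd, expected), then compare the result with the packet count reported by tcpreplay. One system call per packet is normally slower than the memory-mapped ring, so treat this as a diagnostic for ruling out the mmap path rather than as a performance fix.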

Have you tried recording the packet capture with tcpdump? Tcpdump internally uses libpcap, so if tcpdump is able to capture all packets and your software is unable to do so, it gives evidence that the bug is in your software and it is not an inherent limitation in libpcap.
