Ethernet throughput drops depending on how the send buffer is allocated

We tried the following approaches in our server code and got very poor throughput results. Can coding style and the way memory is allocated affect throughput, and how do we tune it? I need to achieve at least 55%. I measure throughput with Wireshark and apply the following settings.

Server:

    ethtool -K eth0 tx off
    ifdown eth0
    ifconfig eth0 mtu 3500
    ifup eth0
    taskset -c 0 ./server

Client:

    ifconfig eth0 mtu 9000

  1. Array

a. Without populating any data

     unsigned long bytetosent = 4080*65536;
     char sendBuff[4080 * 65536];
     while(x < bytetosent)
     {
        int bytesWritten = send(connfd, (const char*)sendBuff+x, 6144, 0);
        x += 6144;
     }

Throughput: 73%

b. Populated with random data

    for(int y=0; y<(4080); y++)
    {
      sendBuff[y] = (rand() %255);
    }
    while(x < bytetosent)               
    {
       int bytesWritten = send(connfd, (const char*)sendBuff+x, 6144, 0);
       x += 6144;
    }

Throughput: 53%

  2. mmap

     pImagePool = mmap( (void *)DDR_RAM_PHYS,MAPPED_SIZE_BUFFER, PROT_READ, MAP_SHARED, _fdMem, 0);
     while(x < bytetosent)
     {
        bytesWritten =send(connfd, ((const char*)(pImagePool))+ x, 6144, 0);
        x += 6144;
     }

Throughput: 20%

  3. malloc

     char *_mBuffer;
     _mBuffer = malloc(4080 * 65536);

a. Without populating any data

    while(x < bytetosent)
    {
        bytesWritten =send(connfd, (const char*)_mBuffer+x, 6144, 0);
        x += 6144;
    }

Throughput: 73%

b. Populated with random data

Throughput: 50%

  4. vector

     vector<char>_pBuffer;

a. copy

     copy((char*)(pImagePool), (((char*)(pImagePool))+1000000), back_inserter(_pBuffer));
     bytesWritten = send(connfd, &_pBuffer[bytesSent], bytetosent, MSG_CONFIRM);
     bytesSent += bytesWritten;

Throughput: very slow

b. push_back

    for(int y=0; y<bytetosent; y++)
    {
      _pBuffer.push_back(*((char*)pImagePool+y));
    }

Throughput: very very slow
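
The send loops in the tests above discard the return value of send() and always advance by a fixed 6144 bytes. On a stream socket, send() may accept fewer bytes than requested or fail with -1, so the counters can drift from what was actually transmitted. A minimal sketch of a loop that tracks the real count, assuming a blocking stream socket; the helper name `send_all` and its `chunk` parameter are hypothetical and not part of the original code:

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: send exactly `len` bytes from `buf` in pieces of at
// most `chunk` bytes. Returns the number of bytes sent, or -1 on error.
ssize_t send_all(int fd, const char* buf, std::size_t len, std::size_t chunk = 6144)
{
    assert(buf != nullptr && chunk > 0);
    std::size_t sent = 0;
    while (sent < len) {
        std::size_t want = len - sent;
        if (want > chunk) want = chunk;            // never read past the buffer
        ssize_t n = send(fd, buf + sent, want, 0);
        if (n < 0) {
            if (errno == EINTR) continue;          // interrupted by a signal: retry
            return -1;                             // real error: let the caller decide
        }
        sent += static_cast<std::size_t>(n);       // n may be smaller than `want`
    }
    return static_cast<ssize_t>(sent);
}
```

With this helper, each test would reduce to a single call such as `send_all(connfd, sendBuff, bytetosent)`, which keeps the buffer type the only thing that differs between the measurements.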

My server connection setup code:

    int listenfd = 0, connfd = 0;
    struct sockaddr_in serv_addr;
    time_t ticks;
    listenfd = socket(AF_INET, SOCK_STREAM, 0);
    memset(&serv_addr, 0, sizeof(serv_addr)); // zero the struct (0, not the character '0')
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_addr.s_addr = htonl(INADDR_ANY);
    serv_addr.sin_port = htons(8080);
    bind(listenfd, (struct sockaddr*)&serv_addr, sizeof(serv_addr));
    listen(listenfd, 10);
    connfd = accept(listenfd, (struct sockaddr*)NULL, NULL);

My client receive code:

    char recvBuff[4080 * 65536];
    while ( (n = read(sockfd, recvBuff, sizeof(recvBuff)-1)) > 0)
    {
        recvBuff[n] = 0;
        //printf("\n in while\n");
        if(0)//fputs(recvBuff, stdout) == EOF)
        {
            printf("\n Error : Fputs error\n");
        }
    }
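
The receive buffer above is about 255 MiB on the stack, which is far larger than the default stack limit on many Linux systems (commonly 8 MiB), and a stream read never needs a buffer that large. A minimal sketch of a receive loop that reuses a small buffer and counts bytes, so the client can cross-check the Wireshark figure. The names `receive_and_count` and `utilization_percent` are hypothetical, and the definition of the percentage (payload bit rate divided by link rate) is an assumption, since the post does not say how it was derived:

```cpp
#include <cassert>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: read until the peer closes the connection and return
// the total number of bytes received, or -1 on error.
long long receive_and_count(int fd, char* buf, std::size_t cap)
{
    assert(buf != nullptr && cap > 0);
    long long total = 0;
    for (;;) {
        ssize_t n = recv(fd, buf, cap, 0);
        if (n < 0) return -1;   // error
        if (n == 0) break;      // peer closed the connection
        total += n;
    }
    return total;
}

// Assumed definition: payload bits per second as a percentage of link rate.
double utilization_percent(long long bytes, double seconds, double link_bits_per_sec)
{
    assert(seconds > 0.0 && link_bits_per_sec > 0.0);
    return 100.0 * (static_cast<double>(bytes) * 8.0 / seconds) / link_bits_per_sec;
}
```

A heap-allocated buffer of 64 KiB or so (for example a `std::vector<char>`) passed as `buf` would be enough here; the exact size is a tuning choice rather than something taken from the post.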

I tried time profiling with mmap and memcpy. The result is below. Is this considered fast?

Total bytes: 40000 * 1088. Time taken: 0.107197 s.

The code:

    void *pImagePool;
    char _cpy[40000 * 1088];
    /*Memory Mapping */
    const char imagePoolDevice[]="/dev/mem";
    int _fdMem;
    if( (_fdMem = open( imagePoolDevice, O_RDWR | O_SYNC )) < 0 )
    {
     printf("Unable to open /dev/mem %d\n", _fdMem);
    }
    //start
    start = chrono::system_clock::now();
    pImagePool = mmap(0,MAPPED_SIZE_BUFFER, PROT_READ|PROT_WRITE, MAP_SHARED, _fdMem, DDR_RAM_PHYS);
    if( pImagePool == MAP_FAILED ){printf("Mapping Failed\n");}
    else{printf("Successful Mapping\n");}
    // copy from the mapped region itself, not from the address of the pointer variable
    memcpy ( _cpy, pImagePool, sizeof(_cpy));
    //end
    end = chrono::system_clock::now();
    chrono::duration<double>elapsed_seconds = end - start;
    time_t end_time = chrono::system_clock::to_time_t(end);
    cout<<"mmap time taken = "<<elapsed_seconds.count()<<endl;
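
The measurement above uses `system_clock`, which follows the wall clock and can jump if the system time is adjusted, and the timed region also covers the mmap() call and a printf. A minimal sketch that times only the copy with the monotonic `steady_clock` and converts the result to a rate; the helper names `timed_copy` and `megabytes_per_second` are hypothetical:

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy `n` bytes and return the elapsed time in seconds.
double timed_copy(void* dst, const void* src, std::size_t n)
{
    assert(dst != nullptr && src != nullptr);
    const auto start = std::chrono::steady_clock::now();
    std::memcpy(dst, src, n);
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}

// Convert a byte count and a duration into MB/s (1 MB = 1e6 bytes).
double megabytes_per_second(std::size_t bytes, double seconds)
{
    assert(seconds > 0.0);
    return static_cast<double>(bytes) / 1e6 / seconds;
}
```

Taking the reported figures at face value, 40000 * 1088 = 43,520,000 bytes in 0.107197 s is roughly 406 MB/s, although that number includes the mapping call and the printf as well as the copy.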

I finally overcame the problem through trial and error. I tested the two approaches below:

  1. I created an array and used memcpy to copy the data from the pointer returned by mmap into it. This increased the Ethernet throughput to 55%. The code is shown below:

     char _cpy[40000 * 1088];
     pImagePool = mmap(0,MAPPED_SIZE_BUFFER, PROT_READ|PROT_WRITE, MAP_SHARED, _fdMem, DDR_RAM_PHYS);
     memcpy ( _cpy, pImagePool, 40000 * 1088);
     while(bytesSent != bytetosent)
     {
         // request only the bytes that have not been sent yet
         bytesWritten = send(connfd, &_cpy[bytesSent], bytetosent - bytesSent, MSG_CONFIRM);
         bytesSent += bytesWritten;
     }
  2. I copied the data from the pointer returned by mmap into a vector, reserving the vector's capacity first. I apologize for not doing this properly in my earlier test code. The Ethernet throughput increased to 51%.

     vector<char>_pBuffer;
     pImagePool = mmap(0,MAPPED_SIZE_BUFFER, PROT_READ|PROT_WRITE, MAP_SHARED, _fdMem, DDR_RAM_PHYS);
     _pBuffer.reserve(43520000); //40000 * 1088
     copy((char*)pImagePool, (char*)pImagePool + 43520000, back_inserter(_pBuffer));
     while(bytesSent != bytetosent)
     {
         // request only the bytes that have not been sent yet
         bytesWritten = send(connfd, &_pBuffer[bytesSent], bytetosent - bytesSent, MSG_CONFIRM);
         bytesSent += bytesWritten;
     }
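
In the vector approach, `copy` with `back_inserter` still appends one element at a time through `push_back`; `reserve` removes the reallocations but not the per-element calls. Constructing the vector directly from the pointer range lets the library allocate once and typically copy the bytes in bulk. A minimal sketch, where the helper name `copy_to_vector` is hypothetical and `src` stands in for the pointer returned by mmap:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy `n` bytes starting at `src` into a vector using
// the range constructor, which sizes the storage once.
std::vector<char> copy_to_vector(const void* src, std::size_t n)
{
    assert(src != nullptr || n == 0);
    const char* first = static_cast<const char*>(src);
    return std::vector<char>(first, first + n);
}
```

The result can then be sent with `_pBuffer.data()` as the buffer pointer. Whether this changes the measured throughput is untested here; it only removes the per-byte insertion cost from the copy step.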

I don't know why this method works; I just assume the data is copied into the memory cache.
