Performance comparison - reading a pcap file: C++'s ifstream vs. C's fread

Question

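I'm investigating which is the faster binary file reader: C++'s ifstream::read or C's fread.

According to the internet, including a similar question, there isn't much difference, so I decided to dig in myself.

I used a 1.22 GB pcap file containing about 1,377,000 packets. Both programs were compiled with mingw32-g++, with no optimizations.

The header structs are defined according to Wireshark's wiki page on the libpcap file format: https://wiki.wireshark.org/Development/LibpcapFileFormat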
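For readers who want to follow along, here is a minimal sketch of the two pcap-level structs, based on the layout described on that wiki page. The wiki uses guint32-style types; the fixed-width `<cstdint>` types below are my substitution. The Ethernet, IPv4 and TCP structs are not shown because the question does not give their definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// File header: appears once, at the start of the capture file.
struct pcap_global_header {
    std::uint32_t magic_number;   // 0xa1b2c3d4 when written in the reader's byte order
    std::uint16_t version_major;  // major version number
    std::uint16_t version_minor;  // minor version number
    std::int32_t  thiszone;       // GMT to local time correction
    std::uint32_t sigfigs;        // accuracy of timestamps
    std::uint32_t snaplen;        // maximum captured length of a packet, in bytes
    std::uint32_t network;        // link-layer type; 1 means Ethernet
};

// Record header: precedes the data of every captured packet.
struct pcap_packet_header {
    std::uint32_t ts_sec;    // timestamp, seconds
    std::uint32_t ts_usec;   // timestamp, microseconds
    std::uint32_t incl_len;  // number of bytes of packet data saved in the file
    std::uint32_t orig_len;  // length of the packet as it appeared on the wire
};

// Every field is naturally aligned, so no packing directive should be needed;
// these checks fail at compile time if a compiler inserts padding anyway.
static_assert(sizeof(pcap_global_header) == 24, "unexpected padding in global header");
static_assert(sizeof(pcap_packet_header) == 16, "unexpected padding in packet header");
```

Reading these structs straight from the file, as both programs do, assumes the file was written in the byte order of the machine reading it.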

Here is the C code:

#include <stdio.h>
#include <stdlib.h>
#include <Winsock2.h>

/* definition of structs: pcap_global_header, pcap_packet_header, ether_header, ipv4_header, tcp_header */

int main()
{
    int count = 0, bytes_read;

    /* open file */
    FILE * file = fopen("test.pcap", "rb");
    if (file == NULL)
    {
        perror("test.pcap");
        return 1;
    }

    /* read file header */
    struct pcap_global_header gheader;

    fread(&gheader, sizeof(char), sizeof(struct pcap_global_header), file);

    // if not ethernet type
    if(gheader.network != 1)
    {
        printf("not ethernet !\n");
        return 1;
    }

    /* read packets */
    char *buffer = (char*)malloc(gheader.snaplen);

    struct pcap_packet_header pheader;
    struct ether_header eth;
    struct ipv4_header ip;
    struct tcp_header tcp;

    fread(&pheader, sizeof(char), sizeof(struct pcap_packet_header), file);

    while(!feof(file))
    {
        ++count;

        bytes_read = fread(&eth, sizeof(char), sizeof(struct ether_header), file);

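        // ip: EtherType 0x0800 is stored big-endian in the file, so (assuming
        // `type` is a 16-bit field) it reads as 0x0008 on a little-endian host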
        if(eth.type == 0x08)
        {
            bytes_read += fread(&ip, sizeof(char), sizeof(struct ipv4_header), file);

            //tcp
            if( ip.protocol == 0x06 )
            {
                bytes_read += fread(&tcp, sizeof(char), sizeof(struct tcp_header), file);
            }
        }

        //read rest of the packet
        fread(buffer, sizeof(char), pheader.incl_len - bytes_read, file);

        // read next packet's header
        fread(&pheader, sizeof(char), sizeof(struct pcap_packet_header), file);
    }

    printf("(C) total packets: %d\n", count);

    /* release resources */
    free(buffer);
    fclose(file);

    return 0;
}

Here is the C++ code:

#include <iostream>
#include <fstream>
#include <memory>

#include <Winsock2.h>

/* definition of structs: pcap_global_header, pcap_packet_header, ether_header, ipv4_header, tcp_header */

int main()
{
    int count_packets = 0, bytes_read;

    /* open file */
    std::ifstream file("test.pcap", std::fstream::binary | std::fstream::in);
    if(!file)
    {
        std::cerr << "cannot open test.pcap\n";
        return 1;
    }

    /* read file header */
    struct pcap_global_header gheader;

    file.read((char*)&gheader, sizeof(struct pcap_global_header));

    // if not ethernet type
    if(gheader.network != 1)
    {
        std::cerr << "not ethernet !\n";
        return 1;
    }

    /* read packets */
    char *buffer = std::allocator<char>().allocate(gheader.snaplen);

    struct pcap_packet_header pheader;
    struct ether_header eth;
    struct ipv4_header ip;
    struct tcp_header tcp;

    file.read((char*)&pheader, sizeof(pcap_packet_header));

    while(!file.eof())
    {
        ++count_packets;

        file.read((char*)&eth, sizeof(struct ether_header));
        bytes_read = sizeof(struct ether_header);

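        // ip: EtherType 0x0800 is stored big-endian in the file, so (assuming
        // `type` is a 16-bit field) it reads as 0x0008 on a little-endian host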
        if(eth.type == 0x08)
        {
            file.read((char*)&ip, sizeof(struct ipv4_header));
            bytes_read += sizeof(struct ipv4_header);

            //tcp
            if( ip.protocol == 0x06 )
            {
                file.read((char*)&tcp, sizeof(struct tcp_header));
                bytes_read += sizeof(struct tcp_header);
            }
        }

        // read rest of the packet
        file.read(buffer, pheader.incl_len - bytes_read);

        // read next packet's header
        file.read((char*)&pheader, sizeof(pcap_packet_header));
    }

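    std::cout << "(C++) total packets :" << count_packets << std::endl;

    // release the buffer obtained from std::allocator
    std::allocator<char>().deallocate(buffer, gheader.snaplen);

    return 0;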
}

The results were very disappointing:

Result of the C code:

(C) total packets: 1377065

Process returned 0 (0x0)   execution time : 1.031 s
Press any key to continue.

Result of the C++ code:

(C++) total packets :1377065

Process returned 0 (0x0)   execution time : 3.172 s
Press any key to continue.

Naturally, I ran each version several times. So I'm looking for a faster way to read the file in C++.

Answer 1

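ifstream::read() copies data from the stream's internal buffer into your buffer, and that extra copy is the main cause of the performance difference. You can try to avoid it by replacing the internal buffer with your own via pubsetbuf: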

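std::ifstream file;
char buf[1024];
// set the buffer before opening the file; some implementations ignore later calls
file.rdbuf()->pubsetbuf(buf, sizeof buf);
file.open("test.pcap", std::ios::binary);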

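The problem is that the behavior of this function is implementation-defined, and in most cases you will still end up with an extra copy of the data.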
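As a rough sketch of the idea, assuming an implementation that honors the request, the helper below installs a caller-supplied buffer before opening the file and then reads the file in chunks. The name `read_with_custom_buffer` and the 4096-byte chunk size are mine, chosen for illustration only. Whether the buffer is actually used, and whether it helps, depends on the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <ios>
#include <vector>

// Reads the whole file through an ifstream whose filebuf was handed a
// caller-supplied buffer. Returns the number of bytes read (0 if the file
// cannot be opened).
std::size_t read_with_custom_buffer(const char* path, std::size_t buffer_size)
{
    assert(buffer_size > 0);

    // Declared before `file` so that it outlives the stream that uses it.
    std::vector<char> stream_buffer(buffer_size);

    std::ifstream file;
    // Request the buffer BEFORE open(): the effect of pubsetbuf is
    // implementation-defined, and some libraries ignore calls made later.
    file.rdbuf()->pubsetbuf(stream_buffer.data(),
                            static_cast<std::streamsize>(stream_buffer.size()));
    file.open(path, std::ios::binary);
    if (!file)
        return 0;

    std::vector<char> chunk(4096);
    std::size_t total = 0;
    // A short final read sets failbit but still reports its size via gcount().
    while (file.read(chunk.data(), static_cast<std::streamsize>(chunk.size())) ||
           file.gcount() > 0)
    {
        total += static_cast<std::size_t>(file.gcount());
    }
    return total;
}
```

Calling it with, say, `read_with_custom_buffer("test.pcap", 1 << 20)` and timing the call against the default-buffer version is the simplest way to see whether your library benefits.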

In your case you don't need everything ifstream offers, so for both performance and simplicity I would suggest using <cstdio>.

Answer 2

fread() should always be faster, because it reads the bytes straight into your buffer without any extra processing (which isn't needed here).

Also, it is better to read the whole packet at once instead of calling fread() four times per packet. You can then use an ether_header* on the buffer.
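A minimal sketch of that approach with `<cstdio>`: one fread() for the record header and one for the whole packet, with the Ethernet header then taken from the buffer. The struct `ether_header_min` and the function `count_packets` are hypothetical names of mine, since the question does not show its own struct definitions. I copy the header out with memcpy rather than casting the buffer pointer, to stay clear of alignment and aliasing concerns.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <vector>

struct pcap_packet_header {
    std::uint32_t ts_sec, ts_usec, incl_len, orig_len;
};

// Hypothetical minimal Ethernet header: 14 bytes, byte arrays only, so no padding.
struct ether_header_min {
    std::uint8_t dst[6];
    std::uint8_t src[6];
    std::uint8_t type[2];  // EtherType in network byte order
};

// Counts the packet records in a pcap stream positioned just after the global
// header. Returns -1 if a record is truncated or larger than snaplen.
// If ipv4_count is non-null, it is incremented for every IPv4 frame.
long count_packets(std::FILE* file, std::uint32_t snaplen, long* ipv4_count)
{
    assert(file != nullptr);
    std::vector<unsigned char> packet(snaplen);
    pcap_packet_header header;
    long count = 0;

    while (std::fread(&header, sizeof header, 1, file) == 1) {
        if (header.incl_len > packet.size())
            return -1;
        // One call reads the entire packet, headers and payload together.
        if (header.incl_len > 0 &&
            std::fread(packet.data(), 1, header.incl_len, file) != header.incl_len)
            return -1;
        ++count;

        if (header.incl_len >= sizeof(ether_header_min)) {
            ether_header_min eth;
            std::memcpy(&eth, packet.data(), sizeof eth);
            // 0x0800 is IPv4; comparing bytes avoids byte-order conversions.
            if (eth.type[0] == 0x08 && eth.type[1] == 0x00 && ipv4_count != nullptr)
                ++*ipv4_count;
        }
    }
    return count;
}
```

The IPv4 and TCP headers can be picked out of the same buffer at the appropriate offsets, so the number of library calls per packet drops from up to five to two.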

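Using mmap() instead of fread() can give you an additional speedup (there is no need to copy data from kernel mode into a user-mode buffer). On Windows, see CreateFileMapping() and MapViewOfFile(); they let you access the file's contents directly through a pointer, as if the file were one large memory buffer.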
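Here is a rough sketch of the memory-mapped variant, assuming a POSIX system (on Windows the mapping step would use the two functions named above instead). The function names are mine. It also assumes the capture was written in the host's byte order, and it only counts records rather than parsing the protocol headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Walks the pcap records in a memory range that begins with the global header.
// Returns the number of complete records, stopping at the first truncated one.
long count_packets_in_memory(const unsigned char* data, std::size_t size)
{
    const std::size_t global_header_size = 24;
    const std::size_t packet_header_size = 16;
    if (size < global_header_size)
        return 0;

    std::size_t pos = global_header_size;
    long count = 0;
    while (size - pos >= packet_header_size) {
        std::uint32_t incl_len;
        // incl_len sits 8 bytes into the record header (after ts_sec, ts_usec).
        std::memcpy(&incl_len, data + pos + 8, sizeof incl_len);
        pos += packet_header_size;
        if (incl_len > size - pos)
            break;  // packet data runs past the end of the mapping
        pos += incl_len;
        ++count;
    }
    return count;
}

// Maps the file read-only and counts its packets. Returns -1 on error.
long count_packets_mmap(const char* path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    const std::size_t size = static_cast<std::size_t>(st.st_size);

    void* map = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    if (map == MAP_FAILED)
        return -1;

    long count = count_packets_in_memory(static_cast<const unsigned char*>(map), size);
    munmap(map, size);
    return count;
}
```

Keeping the record walk separate from the mapping code means the same function can be reused on a view obtained from MapViewOfFile().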

Answer 1: score 2, accepted, 2016-11-18 11:04:42
Answer 2: score 1, 2016-11-19 16:27:36
