Reading a binary file incrementally

Question

I recently asked another question about parsing a binary file, and I more or less got it working thanks to everyone here.

https://stackoverflow.com/questions/37755225/need-help-reading-binary-file-to-a-structure?noredirect=1#comment62983158_37755225

But I now face a new challenge and need help.

My binary file looks something like this, but much longer:

ST.........¸.°Ý.ø...0.œ...........ESZ4 1975..........IYH.testDBDBDBDBST...........°Ý.ø...................DBDBDBDBST.........P.´Ý.ø...0.œ...........ESZ4 1975..........HTC.testDBDBDBDBST.........‹‚´Ý.ø...................DBDBDBDBST.........ƒD.Þ.ø...0.œ...........ESZ4 1975..........ARM.testDBDBDBDBST.........«E.Þ.ø...................DBDBDBDB

Basically, every message starts with 'ST' and ends with 'DBDBDBDB'. The goal is to parse each message and store it in a data structure. In addition, the layout depends on the message type: different types of message have additional members.

The problem I am having is that I have no idea how to iterate through this binary file. If it were a regular text file, I could just do while(getline(file,s)), but what about a binary file? Is there a way to, say, find the first "ST" and "DBDBDBDB", parse the stuff in between, and then move on to the next pair of ST and DB? Or to somehow read the file incrementally, keeping track of where I am?

I apologise ahead of time for posting so much code.

#pragma pack(push, 1)
struct Header
{
    uint16_t marker;
    uint8_t msg_type;
    uint64_t sequence_id;
    uint64_t timestamp;
    uint8_t msg_direction;
    uint16_t msg_len;
};
#pragma pack(pop)
struct OrderEntryMessage
{
    Header header;
    uint64_t price;
    uint32_t qty;
    char instrument[10];
    uint8_t side;
    uint64_t client_assigned_id;
    uint8_t time_in_force;
    char trader_tag[3];
    uint8_t firm_id;
    char firm[256] ; 
    char termination_string[8]; 
};

struct AcknowledgementMessage
{
    Header header;
    uint32_t order_id;
    uint64_t client_id;
    uint8_t order_status;
    uint8_t reject_code;
    char termination_string[8];
};

struct OrderFillMessage
{
    Header header;
    uint32_t order_id;
    uint64_t fill_price;
    uint32_t fill_qty;
    uint8_t no_of_contras;
    uint8_t firm_id;
    char trader_tag[3];
    uint32_t qty;
    char termination_string[8]; 
};
void TradeDecoder::createMessage()
{
    ifstream file("example_data_file.bin", std::ios::binary);

     //I want to somehow Loop here to keep looking for headers ST

    Header h;
    file.read ((char*)&h.marker, sizeof(h.marker));
    file.read ((char*)&h.msg_type, sizeof(h.msg_type));
    file.read ((char*)&h.sequence_id, sizeof(h.sequence_id));
    file.read ((char*)&h.timestamp, sizeof(h.timestamp));
    file.read ((char*)&h.msg_direction, sizeof(h.msg_direction));
    file.read ((char*)&h.msg_len, sizeof(h.msg_len));
    file.close();

    switch(h.msg_type) // dispatch on the message type rather than the sequence id
    {
        case 1: 
            createOrderEntryMessage(h); //this methods creates a OrderEntryMessage with the header
        break;
        case 2:
            createOrderAckMessage(h); //same as above
        break;
        case 3:
            createOrderFillMessage(h); //same as above
        break;
        default:
        break;
    }
}

Many thanks.

Answer 1

You can read the whole file into a buffer and then parse the buffer according to your requirements.

I would use fread to read the whole file into a buffer and then process/parse the buffer byte by byte.

This is an example:

/* fread - read an entire file to a buffer */
#include <stdio.h>
#include <stdlib.h>

int main () {
  FILE * pFile;
  long lSize;
  char * buffer;
  size_t result;

  pFile = fopen ( "myfile.bin" , "rb" );
  if (pFile==NULL) 
     {fputs ("File error",stderr); exit (-1);}

  // obtain file size (ftell returns -1 on failure):
  fseek (pFile , 0 , SEEK_END);
  lSize = ftell (pFile);
  if (lSize < 0)
     {fputs ("Size error",stderr); exit (-1);}
  rewind (pFile);

  // allocate memory to contain the whole file:
  buffer = (char*) malloc (sizeof(char)*lSize);
  if (buffer == NULL) // malloc failed
     {fputs ("Memory error",stderr); exit (-2);}

  // copy the file into the buffer (fread returns the number of bytes read):
  result = fread (buffer,1,lSize,pFile);
  if (result != (size_t)lSize)
     {fputs ("Reading error",stderr); exit (-3);}

  // the whole file is now loaded in the memory buffer. 
  // you can process the whole buffer now: 

  // for (int i=0; i<lSize;i++)
  // {
  //    processBufferByteByByte(buffer[i]);
  // }
  // or
  // processBuffer(buffer,lSize);

  // terminate
  fclose (pFile);
  free (buffer);
  return 0;
}
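
Once the file is in memory, the "find ST ... DBDBDBDB, parse, move on" loop from the question could be sketched roughly as below. This is only a sketch under assumptions, not a definitive implementation: read_file, split_messages and decode_header are hypothetical helper names, the code needs C++17, it assumes the byte sequences "ST" and "DBDBDBDB" never occur inside a message body, and it assumes the file uses the same byte order as the machine reading it. The Header layout is copied from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <iterator>
#include <optional>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>

// Packed header layout copied from the question.
#pragma pack(push, 1)
struct Header {
    std::uint16_t marker;
    std::uint8_t  msg_type;
    std::uint64_t sequence_id;
    std::uint64_t timestamp;
    std::uint8_t  msg_direction;
    std::uint16_t msg_len;
};
#pragma pack(pop)
static_assert(sizeof(Header) == 22, "Header must be packed");

// Framing bytes as described in the question.
constexpr std::string_view kStart = "ST";
constexpr std::string_view kEnd   = "DBDBDBDB";

// Hypothetical helper: load the whole file into memory (C++ analogue of fread).
std::string read_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: one view per complete message, from "ST" through the
// closing "DBDBDBDB" inclusive. The views point into buf, so buf must outlive
// them. Assumes neither marker appears inside a message body.
std::vector<std::string_view> split_messages(std::string_view buf) {
    std::vector<std::string_view> messages;
    std::size_t pos = 0;
    while (true) {
        const std::size_t start = buf.find(kStart, pos);
        if (start == std::string_view::npos) break;
        const std::size_t end = buf.find(kEnd, start + kStart.size());
        if (end == std::string_view::npos) break;  // truncated final message
        const std::size_t stop = end + kEnd.size();
        assert(stop > pos);                        // every iteration advances
        messages.push_back(buf.substr(start, stop - start));
        pos = stop;                                // "where I am" in the buffer
    }
    return messages;
}

// Hypothetical helper: copy the leading bytes of a message into a Header.
// Assumes the file's byte order matches the host's.
std::optional<Header> decode_header(std::string_view msg) {
    if (msg.size() < sizeof(Header)) return std::nullopt;
    Header h;
    std::memcpy(&h, msg.data(), sizeof h);
    return h;
}
```

A caller might store the result of read_file("example_data_file.bin") in a std::string, loop over split_messages on that string, and switch on the msg_type returned by decode_header to choose which message struct to fill. The string has to stay alive for as long as the views are in use. If msg_len turns out to give a reliable message length, stepping through the buffer by that length would be more robust than searching for the markers.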
