Reading records from a buffer

Question

I have a huge file to read. An I/O thread reads data from disk in 4 MB chunks and stores it in a circular array of 6 buffers (4 MB each). Another thread reads from the circular buffer and converts the data into records.

The problem is that a record can span two buffers: for example, a record can start near the end of the first buffer and continue into the start of the next one.

How do I handle such cases?

Could you point me to a sample implementation?

Answer 1

Your function for reading from a buffer should continue reading from the next buffer when a record spans two buffers.

More precisely, create a function that assembles a record from the data in the buffer. If the data pointer hits the end of a buffer before the record is finished, set the data pointer to the beginning of the next buffer.
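The idea above can be sketched as follows. This is only an illustration, not the poster's code: `RingCursor` and `assembleRecord` are hypothetical names, the record length is assumed to be known in advance, and synchronization with the I/O thread is left out (the sketch assumes every buffer the record touches has already been filled).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical cursor into a ring of buffers (names are illustrative only).
struct RingCursor {
    const std::vector<std::vector<char>>* buffers; // the circular array
    std::size_t index;   // which buffer the cursor is in
    std::size_t offset;  // position inside that buffer
};

// Copies `length` bytes starting at the cursor into `out`, hopping to the
// next buffer (wrapping around the ring) whenever the current one runs out.
// Assumes all buffers are non-empty and already filled by the producer.
inline void assembleRecord(RingCursor& cur, char* out, std::size_t length) {
    assert(cur.buffers && !cur.buffers->empty());
    std::size_t copied = 0;
    while (copied < length) {
        const std::vector<char>& buf = (*cur.buffers)[cur.index];
        assert(!buf.empty());
        if (cur.offset == buf.size()) {
            // End of this buffer: continue at the start of the next one.
            cur.index = (cur.index + 1) % cur.buffers->size();
            cur.offset = 0;
            continue;
        }
        // Copy as much as this buffer can still provide.
        std::size_t chunk = std::min(length - copied, buf.size() - cur.offset);
        std::memcpy(out + copied, buf.data() + cur.offset, chunk);
        copied += chunk;
        cur.offset += chunk;
    }
}
```

In a real two-thread setup, the hop to the next buffer is the point where the consumer would have to wait until the I/O thread marks that buffer as filled, and where it could hand the finished buffer back for reuse.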

Hmmm, looks like this can be applied more generically. Build items by reading from a data pointer. Before the data pointer is accessed, check for the end of the buffer. If the pointer is past the end of the buffer, set it to the beginning of the next buffer. This concept is very similar to buffered I/O. Hmmm, perhaps you can modify the iostreams or create your own stream that fetches data from your buffers instead of cin or a file. Look at std::istringstream.
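One way to "create your own" stream is to derive from std::streambuf and override underflow(), which the standard library calls when the current get area is exhausted. The sketch below is an assumption about how that could look, not something from the answer: `ChainStreamBuf` is a hypothetical name, and the chunks are plain in-memory strings rather than buffers filled by an I/O thread.

```cpp
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical read-only stream buffer that walks a list of memory chunks.
class ChainStreamBuf : public std::streambuf {
public:
    explicit ChainStreamBuf(std::vector<std::string>& chunks)
        : chunks_(chunks), next_(0) {}

protected:
    // Called by the stream when the current get area has been consumed.
    int_type underflow() override {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        // Skip empty chunks and point the get area at the next non-empty one.
        while (next_ < chunks_.size()) {
            std::string& c = chunks_[next_++];
            if (!c.empty()) {
                setg(&c[0], &c[0], &c[0] + c.size());
                return traits_type::to_int_type(*gptr());
            }
        }
        return traits_type::eof(); // no more data
    }

private:
    std::vector<std::string>& chunks_; // must outlive the stream buffer
    std::size_t next_;                 // index of the next chunk to expose
};
```

Wrapping it as `std::istream in(&sb);` lets the parser use `in.read(...)` or `operator>>` without knowing where one chunk ends and the next begins, so a record that straddles two chunks needs no special handling in the parsing code.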

Answer 2

You should split your record-reading process into two steps:

1. Convert your chain of buffers into an input stream.
2. Parse the input stream to produce records.

You can use standard classes to achieve the first step, as Thomas said, or implement your own solution. A trivial solution may look like this (assuming a fixed size for records):

class BufferReader {
...
public:
   // Reads data from the buffers.
   // The amount of data read is arbitrary and does not depend on the buffer size.
   // Returns -1 when EOF is reached, otherwise the number of bytes read.
   int readData(char *data, int length);
...
};

Then you can parse your records:

const int size = /* size of the record */;
BufferReader br(/* some construction parameters here */);
std::vector<char> data(size);  // needs <vector>; a variable-length char array is not standard C++
while (br.readData(data.data(), size) == size) {
   // parse your data to fill your record
   ...
}
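The answer leaves the body of readData open. A minimal single-threaded sketch of what it might do is shown below. Everything beyond the readData signature is an assumption: the constructor taking a list of already-filled buffers is hypothetical, and a real version would instead wait on the circular array fed by the I/O thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

// Illustrative stand-in for the BufferReader declared above.
class BufferReader {
public:
    // Hypothetical constructor: takes ownership of pre-filled buffers.
    explicit BufferReader(std::vector<std::vector<char>> buffers)
        : buffers_(std::move(buffers)), index_(0), offset_(0) {}

    // Copies up to `length` bytes into `data`, crossing buffer boundaries
    // as needed. Returns the number of bytes copied, or -1 if no data is left.
    int readData(char* data, int length) {
        assert(data != nullptr && length >= 0);
        int copied = 0;
        while (copied < length && index_ < buffers_.size()) {
            const std::vector<char>& buf = buffers_[index_];
            if (offset_ == buf.size()) { // current buffer fully consumed
                ++index_;
                offset_ = 0;
                continue;
            }
            std::size_t chunk =
                std::min<std::size_t>(length - copied, buf.size() - offset_);
            std::memcpy(data + copied, buf.data() + offset_, chunk);
            copied += static_cast<int>(chunk);
            offset_ += chunk;
        }
        return (copied == 0 && length > 0) ? -1 : copied;
    }

private:
    std::vector<std::vector<char>> buffers_;
    std::size_t index_;  // buffer currently being read
    std::size_t offset_; // read position inside that buffer
};
```

Because readData keeps its own position across calls, a fixed-size record that starts in one buffer and ends in the next comes back from a single call, and a short final read (fewer bytes than requested) ends the parsing loop.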

Answer 1: 0, 2013-03-05 00:51:22

Answer 2: 0, 2013-03-05 10:11:06
