Memory mapping a space-separated floating-point text file to a vector of floats

Question

I have a text file containing several thousand floating-point numbers separated by spaces. I am trying to use memory mapping to copy all of that data from the text file into a vector of floats, using C++ in Visual Studio 2010. Below is the code I wrote to read the text file into memory. It just prints random-looking numbers that make no sense. Can anyone help me fix it and copy the data into a vector of floats?

#include <boost/iostreams/device/mapped_file.hpp>
#include <cstdlib>
#include <iostream>

int main()
{
    boost::iostreams::mapped_file_source file;
    int numberofElements = 1000000;
    int numberofBytes = numberofElements * sizeof(float);
    file.open("ReplayTextFile.txt", numberofBytes);

    if (file.is_open())
    {
        float* data = (float*)file.data();
        for (int i = 0; i < numberofElements; i++)
            std::cout << data[i] << ", ";

        file.close();
    }
    else
    {
        std::cout << "couldn't map the file" << std::endl;
    }
    system("pause");
    return 0;
}

Answer 1

This code looks at the underlying representation of the input text, takes sizeof(float) bytes of it, and treats those bytes as an actual float.

In a typical case, a float is four bytes, so given input like 1.23456, it takes the characters 1.23, looks at their underlying representation (typically 0x31, 0x2E, 0x32, 0x33), and interprets those bytes as a float.

Then it takes 456 plus the following space (0x34, 0x35, 0x36, 0x20) and interprets that as a second float.
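To make the effect concrete, here is a minimal sketch of what the cast in the question amounts to. The helper name reinterpretAsFloat is made up for this illustration, and the exact value it produces depends on the platform's float format and byte order.

```cpp
#include <cstring>

// Hypothetical helper: copy the first sizeof(float) characters of `text`
// into a float without parsing them. This mirrors what reading through
// `(float*)file.data()` does to the mapped text.
// Precondition: `text` holds at least sizeof(float) characters.
float reinterpretAsFloat(const char* text) {
    float value;
    std::memcpy(&value, text, sizeof value);
    return value;
}
```

Calling reinterpretAsFloat("1.23456 ") returns whatever float happens to share its bit pattern with the characters 1.23, which is not the number 1.23.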

Obviously, you need to convert the digits of one number into one float, ignore the space, then convert the digits of the next number into the next float.
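If you want to keep the memory mapping, that conversion can be done by hand over the mapped bytes. The following is only a sketch: parseFloats is a name invented here, and it assumes the file holds nothing but numbers separated by whitespace (a token that is not a number is stored as 0).

```cpp
#include <cctype>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: parse whitespace-separated numbers from [begin, end).
// The range need not be null-terminated, which matters for a mapped file:
// each token is copied into a std::string before std::strtod reads it.
std::vector<float> parseFloats(const char* begin, const char* end) {
    std::vector<float> result;
    const char* p = begin;
    while (p != end) {
        // Skip the separators between numbers.
        while (p != end && std::isspace(static_cast<unsigned char>(*p)))
            ++p;
        // Find the end of the current token.
        const char* tokenEnd = p;
        while (tokenEnd != end && !std::isspace(static_cast<unsigned char>(*tokenEnd)))
            ++tokenEnd;
        if (tokenEnd != p) {
            const std::string token(p, tokenEnd);
            result.push_back(static_cast<float>(std::strtod(token.c_str(), nullptr)));
        }
        p = tokenEnd;
    }
    return result;
}
```

With the mapped_file_source from the question, the call would look like parseFloats(file.data(), file.data() + file.size()), provided the file is mapped in full so that size() matches the real file length.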

The easiest way to do that is to open the file as a stream, then initialize a vector<float> from a pair of istream_iterators constructed from that stream:

std::ifstream in("ReplayTextFile.txt");

std::vector<float> floats { std::istream_iterator<float>(in),
                            std::istream_iterator<float>()};

Answer 2

First, a terminology quibble: you're not copying the data from the file into a vector of floats; you're memory-mapping the data and viewing it as an array of floats.

Second, when you memory-map a file, the contents of memory are literally the same as the contents of the file on disk. So if a file contained the text 2.203 and nothing else, then mapping it and reading element 0 as a float would (assuming sizeof(float) == 4) read the bytes 32 2e 32 30 (in hex). Those bytes would be interpreted as a float, which is not what you want.

Instead, you need to process the input at some point and convert the string representations into the bytes that represent each number as a floating-point value. You can do that by opening the file with ifstream and then using the >> operator to read into a float.

However, if you want the runtime efficiency that comes with memory-mapping a file, you probably don't want to parse floats every time you run your program. In that case, you need to preprocess the file once, converting it from a series of numbers written as strings into the raw bytes that represent those floating-point numbers.

In my code, I've used the function below to write the bytes to an ostream opened with ios_base::binary.

void writeFloat(std::ostream &out, float f) {
    // Write the raw bytes of f in the machine's native byte order;
    // sizeof f replaces the hard-coded assumption of four bytes.
    out.write(reinterpret_cast<const char*>(&f), sizeof f);
}

Once you've prepared the file, you should be able to memory-map it and read the data from it just as your code already does.
