C++: reading a map from a binary file that was created in Python
I created a Python script which creates the following map (illustration):
map<uint32_t, string> tempMap = {{2,"xx"}, {200, "yy"}};
and saved it as a map.out file (a binary file). When I try to read the binary file from C++, it doesn't copy the map. Why?
map<uint32_t, string> tempMap;
ifstream readFile;
std::streamsize length;
readFile.open("somePath\\map.out", ios::binary | ios::in);
if (readFile)
{
readFile.ignore( std::numeric_limits<std::streamsize>::max() );
length = readFile.gcount();
readFile.clear(); // Since ignore will have set eof.
readFile.seekg( 0, std::ios_base::beg );
readFile.read((char *)&tempMap,length);
for(auto &it: tempMap)
{
/* cout<<("%u, %s",it.first, it.second.c_str()); ->Prints map*/
}
}
readFile.close();
readFile.clear();
It's not possible to read in raw bytes and have them construct a map (or most containers, for that matter) [1], so you will have to write some code to perform proper serialization instead.
If the data being stored/loaded is simple, as in your example, then you can easily devise a scheme for how it might be serialized, and then write the code to load it. For example, a simple plaintext mapping can be established by writing each member to the file on its own line:
<number>
<string>
...
So for your example of:
std::map<std::uint32_t, std::string> tempMap = {{2,"xx"}, {200, "yy"}};
this could be encoded as:
2
xx
200
yy
In which case the code to deserialize this would simply read each value one by one and reconstruct the map:
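// Note: untested
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <limits>
#include <map>
#include <string>
#include <utility>

auto loadMap(const std::filesystem::path& path) -> std::map<std::uint32_t, std::string>
{
    auto result = std::map<std::uint32_t, std::string>{};
    auto file = std::ifstream{path};
    while (true) {
        auto key = std::uint32_t{};
        auto value = std::string{};
        if (!(file >> key)) { break; }
        // operator>> leaves the newline after the number in the stream;
        // discard it, otherwise getline would return an empty string.
        file.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
        if (!std::getline(file, value)) { break; }
        result[key] = std::move(value);
    }
    return result;
}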
Note: For this to work, your Python program must output the format that your C++ program will read.
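Since the question asks about a binary file, the same idea carries over to a binary layout. The following is only a minimal sketch under stated assumptions: the record layout (4-byte little-endian key, 4-byte little-endian length, then the raw string bytes) and the names writeU32, readU32, saveMapBinary and loadMapBinary are all hypothetical and are not taken from the question. Decoding the integers byte by byte keeps the format independent of the host's byte order.

```cpp
#include <cstdint>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical record layout (an assumption for illustration):
//   [key: 4 bytes LE][length: 4 bytes LE][length bytes of string data]

// Write a 32-bit value as four little-endian bytes.
void writeU32(std::ostream& out, std::uint32_t v)
{
    for (int i = 0; i < 4; ++i) {
        out.put(static_cast<char>((v >> (8 * i)) & 0xFFu));
    }
}

// Read four little-endian bytes into v; returns false on EOF or error.
bool readU32(std::istream& in, std::uint32_t& v)
{
    char buf[4];
    if (!in.read(buf, 4)) { return false; }
    v = 0;
    for (int i = 0; i < 4; ++i) {
        v |= static_cast<std::uint32_t>(static_cast<unsigned char>(buf[i])) << (8 * i);
    }
    return true;
}

// Serialize every entry as one record.
void saveMapBinary(std::ostream& out, const std::map<std::uint32_t, std::string>& m)
{
    for (const auto& entry : m) {
        writeU32(out, entry.first);
        writeU32(out, static_cast<std::uint32_t>(entry.second.size()));
        out.write(entry.second.data(), static_cast<std::streamsize>(entry.second.size()));
    }
}

// Rebuild the map record by record; stops at EOF or at a truncated record.
std::map<std::uint32_t, std::string> loadMapBinary(std::istream& in)
{
    std::map<std::uint32_t, std::string> result;
    std::uint32_t key = 0;
    std::uint32_t length = 0;
    while (readU32(in, key) && readU32(in, length)) {
        std::string value(length, '\0');
        if (length > 0 && !in.read(&value[0], static_cast<std::streamsize>(length))) { break; }
        result[key] = std::move(value);
    }
    return result;
}
```

When used with files, both streams should be opened with std::ios::binary. On the Python side, something like struct.pack("<I", key) would produce the matching 4-byte little-endian integers, provided the script follows the same assumed layout.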
If the data you are trying to read/write is sufficiently complicated, you may want to look into different serialization interchange formats. Since you're working between Python and C++, you'll need to look into libraries that support both. For a list of recommendations, see the answers to "Cross-platform and language (de)serialization".
[1] The reason you can't just read (or write) the whole container as bytes and have it work is that data in containers isn't stored inline. Writing the raw bytes out won't automatically produce something like
2 xx\n200 yy\n
for you. Instead, you'll be writing the raw addresses of pointers to indirect data structures such as the map's internal node objects.
For example, a hypothetical map implementation might contain a node like:
template <typename Key, typename Value>
struct map_node
{
Key key;
Value value;
map_node* left;
map_node* right;
};
(The real map implementation is much more complicated than this, but this is a simplified representation.)
If map<Key,Value> contains a map_node<Key,Value> member, then writing this out in binary will write the binary representation of key, value, left, and right, the latter two of which are pointers. The same is true of any container that uses indirection of any kind; the addresses will fundamentally differ between the time they are written and the time they are read, since they depend on the state of the program at any given time.
You can write a simple map_node to test this, and just print out the bytes to see what it produces; the pointer will be serialized as well. Behaviorally, this is exactly the same as what you are trying to do with a map and reading from a binary file. See the example below, which includes different addresses.
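One way such a test might look is sketched here. It is a rough illustration, not the real std::map node: simple_node is a hypothetical, trimmed-down stand-in (the string value is left out so the struct stays trivially copyable), and rawBytes, leftPointerFrom and toHex are illustrative helper names. rawBytes copies exactly the bytes that a raw write of the object would emit.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified node (an assumption; not the real std::map node).
struct simple_node
{
    std::uint32_t key;
    simple_node* left;
    simple_node* right;
};

// Copy the object representation of a node: the same bytes that
// file.write((const char*)&node, sizeof(node)) would write out.
std::vector<unsigned char> rawBytes(const simple_node& node)
{
    std::vector<unsigned char> bytes(sizeof(simple_node));
    std::memcpy(bytes.data(), &node, sizeof(simple_node));
    return bytes;
}

// Recover the `left` pointer from the raw bytes, showing that the
// address itself (not the pointed-to node) is what was stored.
simple_node* leftPointerFrom(const std::vector<unsigned char>& bytes)
{
    simple_node* p = nullptr;
    std::memcpy(&p, bytes.data() + offsetof(simple_node, left), sizeof(p));
    return p;
}

// Format bytes as space-separated two-digit hex values for printing.
std::string toHex(const std::vector<unsigned char>& bytes)
{
    std::ostringstream out;
    out << std::hex << std::setfill('0');
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i != 0) { out << ' '; }
        out << std::setw(2) << static_cast<int>(bytes[i]);
    }
    return out.str();
}
```

Printing toHex(rawBytes(root)) for a node whose left member points at another node shows the key followed by address bytes, and none of the child's own data. Those address bytes only mean something inside the process that wrote them, so reading them back in a later run yields a dangling pointer.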
You can use Protocol Buffers to serialize your map in Python, and the deserialization can then be performed in C++. Protocol Buffers supports both Python and C++.