Reading bytes from a binary file into a long int

I have two questions:

  • I have data in a binary file. I want to read the first 8 bytes into a signed long int using the read function, but I could not. Do you know how I can do that?

  • How can I read a block of data directly into a string? Can I read it as shown in this example:

      ifstream is;
      is.open("test.txt", ios::binary);
      string str;
      is.read(str.c_str, 40); // 40 bytes should be read

"I want to read the first 8 bytes into a signed long int using the read function, but I could not. Do you know how I can do that?"

Don't assume long is wide enough; it often isn't. long long, however, is guaranteed to be at least 64 bits wide:

long long x;
// read() expects a char*, so the address of x needs a reinterpret_cast.
is.read(reinterpret_cast<char *>(&x), 8);

Mind you, this is still highly non-portable, because integer sizes and endianness vary between platforms.

As for your second question, try:

char buf[41];
is.read(buf, 40);
// check for errors
buf[40] = '\0';

std::string str(buf);

or, safer:

char buf[41];
is.get(buf, sizeof(buf), '\0');
std::string str(buf);

I'm sure you mean reading 8 bytes into a 64-bit integer instead, and there are a variety of ways to accomplish this. One way is to use a union:

union char_long {
  char chars[8];
  uint64_t n;
};

// Extract 8 bytes and combine into a 64-bit number by using the
// internals of the union structure.
char_long rand_num;  
for(int i = 0; i < 8; i++) {
  rand_num.chars[i] = in.get(); // `in` is the istream.
}    

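Now rand_num.n holds the integer, so you can access it. Note that reading a union member other than the one most recently written is formally undefined behavior in C++ (it is permitted in C), although common compilers such as GCC support it.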

As for the second question, read in the bytes and assign them to the string:

const int len = 5; // Some amount.
char *buf = new char[len];
ifstream in("/path/to/file", ios::binary);
in.read(buf, len);
string str;
// buf is not null-terminated, so pass the number of bytes actually read.
str.assign(buf, in.gcount());
delete[] buf;

You may also be concerned about the portability of the code and the data: if you exchange binary files between different machines, the binary data may be read as garbage (e.g. because of differences in endianness and word size). If you only read binary data on the same machine that wrote it, that is fine.

Another concern, especially when the data is huge and/or costly, is robustness as your code base evolves. For instance, if you read a binary structure and later have to change the type of one of its fields from int (or int32_t) to long (or int64_t), your existing binary data file becomes useless (unless you write specific conversion routines). If the binary file was costly to produce (e.g. it needed an experimental device or an expensive computation), you could be in trouble.

This is why structured textual formats (which are not a silver bullet, but are helpful) or database management systems are used. Structured textual formats include XML (which is quite complex), JSON (which is very simple), and YAML (whose complexity and power lie between those of XML and JSON). Textual formats are also easier to debug (you can look at them in an editor). Several free libraries exist for dealing with these data formats. Databases are often more or less relational and SQL-based. There are several free DBMSs (e.g. PostgreSQL or MySQL).

Regarding the portability of binary files (between different machines), you may be interested in serialization techniques, formats (XDR, ASN.1) and libraries (e.g. s11n and others).

If space or bandwidth is a concern, you might also consider compressing your textual data.
