Reading from binary file and converting to double?

I am trying to write a C program that reads a binary file and converts its contents to a given data type. I generate the binary file with the head command: head -c 40000 /dev/urandom > data40.bin. The program works for the int and char data types but fails for double. Here is the code for the program.

void double_funct(int readFrom, int writeTo){
    double buffer[150];
    int a = read(readFrom,buffer,sizeof(double));
    while(a!=0){
        int size = 1;
        int c=0;

        for(c=0;c<size;c++){
            char temp[100];
            int x = snprintf(temp,100,"%f ", buffer[c]);
            write(writeTo, temp, x);
        }
        a = read(readFrom,buffer,sizeof(double));
    }
}

And this is the char function that works:

void char_funct(int readFrom, int writeTo){
    char buffer[150];
    int a = read(readFrom,buffer,sizeof(char));
    while(a!=0){
        int size = 1;
        int c=0;

        for(c=0;c<size;c++){
            char temp[100]=" ";
            snprintf(temp,100,"%d ", buffer[c]);
            write(writeTo, temp, strlen(temp));
        }
        a = read(readFrom,buffer,sizeof(char));
    }
}

With char, wc -w file should report 40000 words, and it does, since each character takes 1 byte. With double, 40000 bytes of data should theoretically give 5000 words, but instead I get a varying count somewhere between 4000 and 15000.

I don't know what is wrong. The same code works for int, where I get 10000 words from 40000 bytes of data.

The main problem seems to be that your temp array is not large enough for your printf format and data. IEEE-754 doubles have a decimal exponent range from -308 to +308. You're printing your doubles with format "%f", which produces a plain decimal representation. Since no precision is specified, the default precision of 6 applies. This may require as many as 1 (sign) + 309 (digits) + 1 (decimal point) + 6 (trailing decimal places) + 1 (terminator) chars (a total of 318), plus one more for the trailing space in your format, but you only have space for 100.
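The size estimate above can be checked directly. This is a minimal sketch, assuming IEEE-754 doubles; formatted_length is a hypothetical helper name, not part of the original program. It relies on the standard behavior that snprintf() with a NULL buffer and size 0 only counts the characters it would produce.

```c
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: returns how many characters (not counting the
 * terminator) the question's format "%f " would produce for value.
 * With a NULL buffer and size 0, snprintf only counts. */
int formatted_length(double value)
{
    return snprintf(NULL, 0, "%f ", value);
}
```

With IEEE-754 doubles, formatted_length(-DBL_MAX) should come to 318: the sign, 309 digits, the decimal point, 6 decimals, and the trailing space. The terminator comes on top of that, so a 100-byte temp is far too small for large values.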

You print to your buffer using snprintf(), and therefore do not overrun the array bounds there, but snprintf() returns the number of bytes that would have been required, less the one required for the terminator. That's the number of bytes you write(), and in many cases that does overrun your buffer: write() reads past the end of temp and emits whatever bytes happen to follow it in memory. Those stray bytes can include whitespace, which would explain the varying word count. You see the result in your output.
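One way to avoid the overrun while keeping the 100-byte buffer is to clamp the length before calling write(). The following is a sketch only; write_double_clamped is a hypothetical name, and truncated values are still emitted incomplete, so this stops the out-of-bounds read without fixing the output format.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: formats value into a fixed buffer and writes only
 * the bytes that are actually in the buffer.
 * Returns the number of bytes written, or -1 on error. */
ssize_t write_double_clamped(int fd, double value)
{
    char temp[100];
    int needed = snprintf(temp, sizeof temp, "%f ", value);

    if (needed < 0)
        return -1;                 /* encoding error */

    size_t len = (size_t)needed;
    if (len >= sizeof temp)
        len = sizeof temp - 1;     /* output was truncated to fit temp */

    return write(fd, temp, len);
}
```

The key point is the comparison of the snprintf() return value against the buffer size: a return value greater than or equal to the size means the text did not fit.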

Secondarily, you'll also see a large number of 0.000000 in your output, arising from rounding small numbers to six decimal places.
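To illustrate the rounding effect, here is a small sketch; prints_as_zero is a hypothetical helper that reports whether a value collapses to zero under "%f" with its default precision of 6.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if value prints as zero under "%f"
 * (default precision 6), 0 otherwise. */
int prints_as_zero(double value)
{
    char temp[512];   /* large enough for any finite double under "%f" */

    snprintf(temp, sizeof temp, "%f", value);
    return strcmp(temp, "0.000000") == 0 || strcmp(temp, "-0.000000") == 0;
}
```

Any value whose magnitude is below about 0.0000005 falls into this case, and random bit patterns produce many such values because their exponents are spread across the whole range.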

You would probably have better success if you change the format with which you're printing the numbers. For example, "%.16e " will give you output in exponential format with a total of 17 significant digits (one preceding the decimal point). That will not require excessive space in memory or on disk, and it will accurately convey all numbers, regardless of scale, supposing again that your doubles are represented per IEEE 754. If you wish, you can furthermore eliminate the (pretty safe) assumption of IEEE 754 format by employing the variation suggested by @chux in comments. That would be the safest approach.
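Putting the "%.16e " suggestion into the original function could look like the sketch below. The name double_funct_fixed, the return value, and the error handling are illustrative additions rather than part of the original code, and the loop assumes the input is a regular file, so a short read only happens at the end.

```c
#include <stdio.h>
#include <unistd.h>

/* Sketch: read doubles one at a time from readFrom and write each one as
 * text to writeTo. Returns the number of values converted, or -1 on error. */
long double_funct_fixed(int readFrom, int writeTo)
{
    double value;
    long count = 0;
    ssize_t got;

    /* Stop on EOF (0), on error (-1), or on a short trailing read. */
    while ((got = read(readFrom, &value, sizeof value)) == (ssize_t)sizeof value) {
        char temp[64];   /* "%.16e " needs at most about 25 characters */
        int len = snprintf(temp, sizeof temp, "%.16e ", value);

        if (len < 0 || (size_t)len >= sizeof temp)
            return -1;   /* formatting failed or did not fit */
        if (write(writeTo, temp, (size_t)len) != len)
            return -1;   /* write error or short write */
        count++;
    }
    return got < 0 ? -1 : count;
}
```

For a 40000-byte input this should produce 5000 words. If hard-coding 16 is undesirable, C11's DBL_DECIMAL_DIG from <float.h> gives the number of significant digits needed for a round trip, so "%.*e " with a precision of DBL_DECIMAL_DIG - 1 is one alternative; whether that matches the variation mentioned above is not something this sketch assumes.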

One more thing: IEEE floating point supports infinities and multiple not-a-number values. These are very few in number relative to ordinary FP numbers, but it is still possible that you'll occasionally hit on one of these. They'll probably be converted to output just fine, but you may want to consider whether you need to deal specially with them.
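If special handling is wanted, the classification macros from <math.h> are one option. In this sketch, format_double and the spellings "nan ", "inf " and "-inf " are arbitrary illustrative choices, not something the original program defines.

```c
#include <math.h>
#include <stdio.h>

/* Hypothetical helper: formats value into out (capacity cap).
 * Non-finite values get a fixed spelling chosen for this example;
 * everything else uses "%.16e ". Returns the snprintf result. */
int format_double(char *out, size_t cap, double value)
{
    if (isnan(value))
        return snprintf(out, cap, "nan ");
    if (isinf(value))
        return snprintf(out, cap, value < 0 ? "-inf " : "inf ");
    return snprintf(out, cap, "%.16e ", value);
}
```

Each case still yields exactly one whitespace-delimited word, so the count reported by wc -w is unaffected.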

