
Bizarre behavior from C program: Kernighan & Ritchie exercise 2-3

Hi all.

I've written a program as a solution to Kernighan & Ritchie's exercise 2-3, and its behaviour during testing is (IMHO) wildly unintuitive.

The problem spec says to write a program that converts hex values to their decimal equivalent. The code I've written works fine for smaller hex values, but for larger hex values things get a little... odd. For example, if I input 0x1234, the decimal value 4660 pops out on the other end, which happens to be the correct output (the code also works for letters, i.e. 0x1FC -> 508). If, on the other hand, I input a large hex value, say 0x123456789ABCDEF, I should get 81985529216486895, but instead I get 81985529216486896 (off by one!).

The error in conversion is inconsistent: sometimes the decimal value is too high and other times too low. Generally, larger hex values result in more incorrect digits in the decimal output.

Here's my program in its entirety:

/*Kernighan & Ritchie's Exercise 2-3

Write a function 'htoi' which converts a string of hexadecimal digits (including an 
optional 0x or 0X) into its equivalent integer value.
*/
#include <stdio.h>

#define MAXLINE 1000 //defines maximum size of a hex input

//FUNCTION DEFINITIONS
signed int htoi(char c); //converts a single hex digit to its decimal value

//BEGIN PROGRAM////////////////////////////////////////////////////////////
main()
{
   int i = 0; //counts the length of 'hex' at input
   char c; //character buffer
   char hex[MAXLINE]; //string from input
   int len = 0; //the final value of 'i'
   signed int val; //the decimal value of a character stored in 'hex'
   double n = 0; //the decimal value of 'hex'

   while((c = getchar()) != '\n') //store a string of characters in 'hex'
   {
      hex[i] = c;
      ++i;
   }
   len = i;
   hex[i] = '\0'; //turn 'hex' into a string

   if((hex[0] == '0') && ((hex[1] == 'x') || (hex[1] == 'X'))) //ignore leading '0x'
   {
      for(i = 2; i < len; ++i)
      {
        val = htoi(hex[i]); //call 'htoi'
        if(val == -1 ) //test for a non-hex character
        {
            break;
        }
        n = 16.0 * n + (double)val; //calculate decimal value of hex from hex[0]->hex[i]
      }
   }
   else
   {
      for(i = 0; i < len; ++i)
      {
          val = htoi(hex[i]); //call 'htoi'
          if(val == -1) //test for non-hex character
          {
             break;
          }
          n = 16.0 * n + (double)val; //calc decimal value of hex for hex[0]->hex[i]
      }
   }

 if(val == -1)
 {
    printf("\n!!THE STRING FROM INPUT WAS NOT A HEX VALUE!!\n");
 }
 else
 {
    printf("\n%s converts to %.0f\n", hex, n);
 }

 return 0;
 }

 //FUNCTION DEFINITIONS OUTSIDE OF MAIN()///////////////////////////////////
 signed int htoi(char c)
 {
   signed int val = -1;

   if(c >= '0' && c <= '9')
     val = c - '0';

   else if(c == 'a' || c == 'A')
     val = 10;

   else if(c == 'b' || c == 'B')
     val = 11;

   else if(c == 'c' || c == 'C')
     val = 12;

   else if(c == 'd' || c == 'D')
     val = 13;

   else if(c == 'e' || c == 'E')
     val = 14;

   else if(c == 'f' || c == 'F')
     val = 15;

   else 
   {
     ;//'c' was a non-hex character, do nothing and return -1
   }

   return val;
 }

pastebin: http://pastebin.com/LJFfwSN5

Any ideas on what is going on here?

You are probably exceeding the precision with which double can store integers.
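To see the effect in isolation, here is a minimal sketch that runs the same digit-by-digit accumulation once in a double (as the question's program does) and once in an unsigned long long. The names hex_digit, hex_to_double and hex_to_ull are made up for this illustration, the optional 0x prefix is not handled, and the letter arithmetic assumes an ASCII-like character set.

```c
#include <stdio.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one.
   Assumes 'a'..'f' and 'A'..'F' are contiguous, as in ASCII. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Accumulate in a double, mirroring the loop in the question. */
double hex_to_double(const char *s)
{
    double n = 0.0;
    for (; hex_digit(*s) >= 0; ++s)
        n = 16.0 * n + hex_digit(*s);
    return n;
}

/* Accumulate in an unsigned long long (no overflow check in this sketch). */
unsigned long long hex_to_ull(const char *s)
{
    unsigned long long n = 0;
    for (; hex_digit(*s) >= 0; ++s)
        n = 16 * n + (unsigned long long)hex_digit(*s);
    return n;
}
```

Calling both on "123456789ABCDEF" and printing the results with %.0f and %llu respectively should reproduce the discrepancy from the question on a system with IEEE 754 doubles: the integer version keeps the trailing ...895, while the double version rounds to the nearest value it can represent.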

My suggestion would be to change your code to use unsigned long long for the result, and also to add a check for overflow, e.g.:

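#include <limits.h> // for ULLONG_MAX

unsigned long long n = 0;
// ...

if ( n > (ULLONG_MAX - val) / 16 )  // n * 16 + val would not fit
{
    fprintf(stderr, "Number too big.\n");
    exit(EXIT_FAILURE);
}

n = n * 16 + val;

I test against the limit before multiplying rather than checking n * 16 + val < n afterwards. Unsigned integer types wrap around on overflow, but after multiplying by 16 the wrapped result can still be larger than n (for example when n is 0x1FFFFFFFFFFFFFFF with a 64-bit unsigned long long), so a less-than test on the result would miss some overflows.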
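For reference, here is one way the whole conversion could look as a string-based function in the spirit of the exercise. This is only a sketch: the name htoi_ull, the out-parameter and the return codes are my own assumptions, not anything K&R prescribes. It tests against ULLONG_MAX before multiplying, which sidesteps any reasoning about what a wrapped product looks like.

```c
#include <limits.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one.
   Assumes 'a'..'f' and 'A'..'F' are contiguous, as in ASCII. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Converts a hex string (optional 0x/0X prefix) and stores it in *out.
   Returns 0 on success, -1 for an empty or non-hex string,
   -2 if the value does not fit in an unsigned long long. */
int htoi_ull(const char *s, unsigned long long *out)
{
    unsigned long long n = 0;

    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
        s += 2;                      /* skip the optional prefix */
    if (*s == '\0')
        return -1;                   /* nothing to convert */

    for (; *s != '\0'; ++s) {
        int val = hex_value(*s);
        if (val < 0)
            return -1;               /* non-hex character */
        /* n * 16 + val fits exactly when n <= (ULLONG_MAX - val) / 16 */
        if (n > (ULLONG_MAX - (unsigned long long)val) / 16)
            return -2;
        n = n * 16 + (unsigned long long)val;
    }
    *out = n;
    return 0;
}
```

Returning a status code instead of calling exit keeps the decision about what to do on bad input with the caller, which can then print the result using %llu.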

If you want more range than unsigned long long offers, you will have to get into more advanced techniques (probably beyond the scope of Ch. 2 of K&R, but once you've finished the book you could revisit this).


NB. You also need to #include <stdlib.h> if you take my suggestion of exit, and don't forget to change %.0f to %llu in your final printf. Also, a safer way to get the input (which K&R covers) is:

int c;
while((c = getchar()) != '\n' && c != EOF)

The first time I ran the code on ideone I got a segfault, because I didn't put a newline at the end of stdin, so this loop kept shoving EOF into hex until the buffer overflowed.
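A bounded version of that loop also protects the buffer when the line is longer than expected. The sketch below is one possible shape: read_line is a hypothetical helper name, and it takes a FILE * so it can be tried on something other than stdin (with stdin, getc behaves like getchar).

```c
#include <stdio.h>

/* Hypothetical helper: reads one line from `in` into buf, which has room
   for cap chars including the terminating '\0'. Stops at newline, at EOF,
   or when the buffer is full. Returns the number of characters stored. */
size_t read_line(FILE *in, char *buf, size_t cap)
{
    size_t i = 0;
    int c;                /* int, not char, so EOF can be told apart */

    if (cap == 0)
        return 0;         /* no room even for the terminator */

    while (i + 1 < cap && (c = getc(in)) != EOF && c != '\n')
        buf[i++] = (char)c;

    buf[i] = '\0';        /* always leave a valid string behind */
    return i;
}
```

In the question's program this could be called as len = read_line(stdin, hex, MAXLINE), replacing the unbounded while loop.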

This is a classic example of floating point inaccuracy.

Unlike most of the examples of floating point errors you'll see, this is clearly not about non-binary fractions or very small numbers; in this case, the floating point representation is approximating very big numbers, with the accuracy decreasing the higher you go. The principle is the same as writing "1.6e10" to mean "approximately 16000000000" (I think I counted the zeros right there!), when the actual number might be 16000000001.

You actually run out of accuracy sooner than with an integer of the same size, because only part of the width of a floating point variable is used to represent the digits of the number (a double typically has a 53-bit significand within its 64 bits); the rest holds the sign and the exponent.
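One way to make that limit concrete is to ask whether a given integer can be stored in a double without rounding. The sketch below is an illustration under stated assumptions: the function name exactly_representable is made up, and it assumes a binary floating point format (FLT_RADIX == 2) whose significand is narrower than unsigned long long, as with IEEE 754 doubles and a 64-bit unsigned long long.

```c
#include <float.h>

/* Returns 1 if v can be stored in a double without rounding, else 0.
   Assumes FLT_RADIX == 2 and DBL_MANT_DIG (typically 53) smaller than
   the width of unsigned long long. */
int exactly_representable(unsigned long long v)
{
    /* Trailing zero bits are absorbed by the exponent, so drop them. */
    while (v != 0 && (v & 1) == 0)
        v >>= 1;

    /* What remains must fit in the significand. */
    return (v >> DBL_MANT_DIG) == 0;
}
```

With a 53-bit significand, every integer up to 2^53 (9007199254740992) passes, 2^53 + 1 does not, and the value from the question, 81985529216486895, fails while its neighbour 81985529216486896 (a multiple of 16) passes, which matches the output the question reports.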

