strlen returns wrong answer in C

I am writing a program that converts a decimal number to a Roman numeral. I use four arrays, thousands, hundreds, tens and units, to store the Roman digits, then copy each digit into the res array; the str pointer tracks where the string in res begins. When I test with input 128, it prints CXXVIIIIX, but it should print CXXVIII. While debugging I found that for tmp=8, strlen(units[tmp-1]) = 6, which means strlen also counts IX. And for some inputs, such as 3888, the program prints garbage.

This is my code:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <stdbool.h>
#include <string.h>
#include <windows.h>

int main(){
    int n=128;

    char thousands[][4]={"M","MM","MMM"};
    char hundreds[][5]={"C","CC","CCC","CD", "D", "DC", "DCC", "DCCC", "CM"};
    char tens[][5]={"X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"};
    char units[][5]={"I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"};

    char res[16];
    res[15]='\0';    //add nul to the last character of res array
    char *str=res+14;   //str tracks where the string start
    int tmp;
    
    
    //store roman digits in res array in reverse order, 
    //start from units->tens->hundreds->thousands
    if (n!=0)
    {
        tmp=n%10;
        if (tmp!=0)
        {
            str-=strlen(units[tmp-1]);  //str steps back several address to store new digits
            strncpy(str, units[tmp-1], strlen(units[tmp-1]));   //use strncpy to copy without '\0'
            
        }
        n/=10;
    }
    
    if (n!=0)
    {
        tmp=n%10;
        if (tmp!=0)
        {
            str-=strlen(tens[tmp-1]);
            strncpy(str, tens[tmp-1], strlen(tens[tmp-1]));
            
        }
        n/=10;
    }
    
    if (n!=0)
    {
        tmp=n%10;
        if (tmp!=0)
        {
            str-=strlen(hundreds[tmp-1]);
            strncpy(str, hundreds[tmp-1], strlen(hundreds[tmp-1]));
            
        }
        n/=10;
    }
    
    if (n!=0)
    {
        tmp=n%10;
        if (tmp!=0)
        {
            str-=strlen(thousands[tmp-1]);
            strncpy(str, thousands[tmp-1], strlen(thousands[tmp-1]));
            
        }
        n/=10;
    }
    
    printf("%s", str);
    return 0;
}

So why does this happen, and how can I fix it?

Any help would be appreciated.

After fixing the array size for all string literals, there is still an off-by-one error and an uninitialized char array element.

  • char res[16]; would be enough for "MMMDCCCLXXXVIII" with a trailing '\0'.
  • With res[15]='\0'; you store the string terminator in the last element.
  • char *str=res+14; sets the pointer to the uninitialized character just before the terminator.
  • str-=strlen(something) moves the pointer back by the length of the string to insert. The following strncpy never overwrites that uninitialized character.

The result will always contain an uninitialized trailing character, which may or may not be visible.

A result of the maximum length ("MMMDCCCLXXXVIII") will begin one character before the first array element because of the trailing uninitialized character. You will in fact have a result string "MMMDCCCLXXXVIII*" where * is uninitialized.

Example to demonstrate the off-by-one error:

Note that this is not the full code; it is only intended to show how the variable values change for the input n=3888. (For example, the if (tmp!=0) guards from the original code are omitted here.)

int n=3888;

char res[16];
// * indicates an uninitialized character
res[15]='\0';  // "***************\0"
char *str=res+14; // str = &res[14]
int tmp;

tmp=n%10; // tmp = 8
str-=strlen(units[tmp-1]); // "VIII" (strlen = 4) -> str = &res[10]
strncpy(str, units[tmp-1], strlen(units[tmp-1])); // "**********VIII*\0"
n/=10; // n = 388

tmp=n%10; // tmp = 8
str-=strlen(tens[tmp-1]); // "LXXX" (strlen = 4) -> str = &res[6]
strncpy(str, tens[tmp-1], strlen(tens[tmp-1])); // "******LXXXVIII*\0"
n/=10; // n = 38

/* ...*/

// final result would be

// str = &res[-1]
// str -> "MMMDCCCLXXXVIII*\0"
// res =   "MMDCCCLXXXVIII*\0"

// The last strncpy tries to write a character 'M' one element before the beginning of the array res.

As a fix I propose:

char res[16] = {0}; // initialize the whole array with 0
char *str=res+(sizeof(res)-1);
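Applied to the original back-to-front approach, that fix might look like the sketch below. The names prepend and to_roman_backwards are made up for this illustration. The tables also get a leading "" entry, which is a simplification standing in for the if (tmp!=0) guards rather than something taken from the original code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies s (without its '\0') directly in front of
   the string starting at str and returns the new start of the string. */
static char *prepend(char *str, const char *s)
{
    size_t len = strlen(s);
    str -= len;            /* step back by exactly the digit length */
    memcpy(str, s, len);   /* the terminator of s is deliberately not copied */
    return str;
}

/* Builds the Roman numeral for n (1..3999) in res, working from the units
   backwards, and returns a pointer to its first character inside res. */
char *to_roman_backwards(int n, char res[16])
{
    static const char *const thousands[] = {"", "M", "MM", "MMM"};
    static const char *const hundreds[]  = {"", "C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM"};
    static const char *const tens[]      = {"", "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"};
    static const char *const units[]     = {"", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"};

    assert(n > 0 && n < 4000);
    memset(res, 0, 16);      /* every element is initialized, res[15] is the terminator */
    char *str = res + 15;    /* start AT the terminator, not one element before it */

    str = prepend(str, units[n % 10]);     n /= 10;
    str = prepend(str, tens[n % 10]);      n /= 10;
    str = prepend(str, hundreds[n % 10]);  n /= 10;
    str = prepend(str, thousands[n % 10]);
    return str;
}
```

With a caller-side char buf[16];, the call to_roman_backwards(3888, buf) returns buf itself, because the 15-character result exactly fills the space in front of the terminator.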

... and how to fix it?

  • Consider writing the digits in thousands, hundreds, tens, ones order.

  • Add a "zero" entry to each digit list.

  • Roll the conversion into a helper function.

  • Change the types from character arrays to arrays of pointers.
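One reason the last point helps: a fixed-width row has room for only so many characters, and C silently accepts an initializer that fills the row completely and leaves no space for the '\0'. Calling strlen on such a row is undefined behaviour; in practice it often runs on into the next row. If the rows had originally been declared one byte too narrow, for example char units[][4], that would be consistent with the length of 6 ("VIII" followed by "IX") reported in the question, but this is an assumption, since the code shown uses width 5. The sketch below uses made-up tables narrow, wide and ptrs and a hypothetical helper row_is_terminated to show the difference without reading out of bounds.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if the width-byte row contains a '\0', i.e.
   it holds a valid C string. memchr never reads past the row. */
bool row_is_terminated(const char *row, size_t width)
{
    return memchr(row, '\0', width) != NULL;
}

/* Illustration-only tables, not taken from the original post. */
const char narrow[][4]   = {"VII", "VIII", "IX"}; /* "VIII" fills all 4 bytes: no '\0' stored */
const char wide[][5]     = {"VII", "VIII", "IX"}; /* room for "VIII" plus its '\0' */
const char *const ptrs[] = {"VII", "VIII", "IX"}; /* each literal keeps its own '\0' */
```

Here row_is_terminated(narrow[1], sizeof narrow[1]) is false, while the same check on wide[1] is true, and strlen(ptrs[1]) is safely 4 regardless of how long the longest digit is.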

Example

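#include <assert.h>
#include <stdio.h>
#include <string.h>

// Prints the Roman numeral for n (1..3999) between angle brackets.
void print_roman(int n) {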
  static const char *thousands[] = {"", "M", "MM", "MMM"};
  static const char *hundreds[] = {"", "C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM"};
  static const char *tens[] = {"", "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"};
  static const char *ones[] = {"", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"};
  static const char **units[] = {thousands, hundreds, tens, ones};
  int units_n = sizeof units / sizeof units[0];

  assert(n > 0 && n < 4000);
  char res[16];
  char *p = res;
  int multiplier = 1000;
  for (int i = 0; i < units_n; i++) {
    strcpy(p, units[i][n / multiplier]);
    p += strlen(p);
    n %= multiplier;
    multiplier /= 10;
  }
  printf("<%s>\n", res);
}

int main() {
  print_roman(3888);
  print_roman(3999);
  print_roman(1010);
  print_roman(42);
}

Output

<MMMDCCCLXXXVIII>
<MMMCMXCIX>
<MX>
<XLII>
