strlen returns the wrong answer in C

Question

I am writing a program which converts a decimal number to a Roman numeral. I use 4 arrays (thousands, hundreds, tens, units) to store the Roman digits, then copy each digit to the res array. I use the str pointer to track where the string in res begins. When I test with input 128, it prints CXXVIIIIX, but it should print CXXVIII. I tried debugging and found that when tmp=8, strlen(units[tmp-1]) = 6, which means strlen also counts IX. And for some cases like 3888, the program prints garbage values.

This is my code:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <stdbool.h>
#include <string.h>
#include <windows.h>

int main(){
    int n=128;

    char thousands[][4]={"M","MM","MMM"};
    char hundreds[][5]={"C","CC","CCC","CD", "D", "DC", "DCC", "DCCC", "CM"};
    char tens[][5]={"X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"};
    char units[][5]={"I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"};

    char res[16];
    res[15]='\0';    //add nul to the last character of res array
    char *str=res+14;   //str tracks where the string start
    int tmp;
    
    
    //store roman digits in res array in reverse order, 
    //start from units->tens->hundreds->thousands
    if (n!=0)
    {
        tmp=n%10;
        if (tmp!=0)
        {
            str-=strlen(units[tmp-1]);  //str steps back several address to store new digits
            strncpy(str, units[tmp-1], strlen(units[tmp-1]));   //use strncpy to copy without '\0'
            
        }
        n/=10;
    }
    
    if (n!=0)
    {
        tmp=n%10;
        if (tmp!=0)
        {
            str-=strlen(tens[tmp-1]);
            strncpy(str, tens[tmp-1], strlen(tens[tmp-1]));
            
        }
        n/=10;
    }
    
    if (n!=0)
    {
        tmp=n%10;
        if (tmp!=0)
        {
            str-=strlen(hundreds[tmp-1]);
            strncpy(str, hundreds[tmp-1], strlen(hundreds[tmp-1]));
            
        }
        n/=10;
    }
    
    if (n!=0)
    {
        tmp=n%10;
        if (tmp!=0)
        {
            str-=strlen(thousands[tmp-1]);
            strncpy(str, thousands[tmp-1], strlen(thousands[tmp-1]));
            
        }
        n/=10;
    }
    
    printf("%s", str);
    return 0;
}

So why does this happen, and how can I fix it?

Any help would be appreciated.

Answer 1

After fixing the array size for all string literals, there is still an off-by-one error and an uninitialized char array element.
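The array-size problem can be seen in isolation. The reported strlen result of 6 is what one would expect if each row of units had only 4 bytes: "VIII" then fills the whole row, C silently drops the terminating '\0', and strlen keeps reading into the next row ("IX"). The sketch below assumes that original row width of 4, which the posted code no longer shows; the names narrow_units, wide_units, row_is_terminated and check_rows are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed original layout: 4 bytes per row.
 * "VIII" uses all 4 bytes, so there is no room for '\0'. */
static const char narrow_units[][4] = {"VII", "VIII", "IX"};

/* Fixed layout: 5 bytes per row leaves room for the terminator. */
static const char wide_units[][5] = {"VII", "VIII", "IX"};

/* Returns 1 if the row of `width` bytes contains a '\0', 0 otherwise.
 * memchr never reads past the row, unlike strlen on an unterminated row. */
static int row_is_terminated(const char *row, size_t width) {
    return memchr(row, '\0', width) != NULL;
}

/* Checks both layouts; aborts if an expectation does not hold. */
static void check_rows(void) {
    assert(row_is_terminated(narrow_units[0], sizeof narrow_units[0]));  /* "VII"  */
    assert(!row_is_terminated(narrow_units[1], sizeof narrow_units[1])); /* "VIII" */
    assert(row_is_terminated(wide_units[1], sizeof wide_units[1]));      /* "VIII" */
}
```

With the 5-byte rows every entry is a proper C string, so strlen stops where it should.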

char res[16]; would be enough for "MMMDCCCLXXXVIII" with a trailing '\0'.
With res[15]='\0'; you store the string terminator in the last element.
char *str=res+14; sets the pointer to the uninitialized character just before the terminator.
str-=strlen(something) moves the pointer back by the length of the string to insert. The following strncpy therefore never overwrites the uninitialized character at res[14].

The result will always contain an uninitialized trailing character, which may or may not be visible.

A result of the maximum length ("MMMDCCCLXXXVIII") will begin one character before the first array element because of the trailing uninitialized character. You will in fact have a result string "MMMDCCCLXXXVIII*" where * is uninitialized.

Example to demonstrate the off-by-one error:

Note that this is not the full code; it is only intended to show how the variable values change with input value n=3888. (For example, the if (tmp!=0) guards from the original code are missing here.)

int n=3888;

char res[16];
// * indicates an uninitialized character
res[15]='\0';  // "***************\0"
char *str=res+14; // str = &res[14]
int tmp;

tmp=n%10; // tmp = 8
str-=strlen(units[tmp-1]); // "VIII" (strlen = 4) -> str = &res[10]
strncpy(str, units[tmp-1], strlen(units[tmp-1])); // "**********VIII*\0"
n/=10; // n = 388

tmp=n%10; // tmp = 8
str-=strlen(tens[tmp-1]); // "LXXX" (strlen = 4) -> str = &res[6]
strncpy(str, tens[tmp-1], strlen(tens[tmp-1])); // "******LXXXVIII*\0"
n/=10; // n = 38

/* ...*/

// final result would be

// str = &res[-1]
// str -> "MMMDCCCLXXXVIII*\0"
// res =   "MMDCCCLXXXVIII*\0"

// The last strncpy tries to write a character 'M' one element before the beginning of the array res.

As a fix I propose:

char res[16] = {0}; // initialize the whole array with 0
char *str=res+(sizeof(res)-1);
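Putting this initialization into the whole right-to-left conversion, a minimal sketch could look like the following. It assumes the row widths shown in the question (4 for thousands, 5 for the others) and an input in the range 1..3999. The names prepend and to_roman are hypothetical, and memcpy is used in place of strncpy since the length is already known.

```c
#include <assert.h>
#include <string.h>

/* Moves str back by the length of digit and copies digit there
 * (without its '\0'). Returns the new start of the string. */
static char *prepend(char *str, const char *digit) {
    size_t len = strlen(digit);
    str -= len;
    memcpy(str, digit, len);
    return str;
}

/* Writes the Roman numeral for n (1..3999) into out (at least 16 bytes). */
static void to_roman(int n, char out[16]) {
    static const char thousands[][4] = {"M", "MM", "MMM"};
    static const char hundreds[][5] = {"C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM"};
    static const char tens[][5] = {"X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"};
    static const char units[][5] = {"I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"};

    char res[16] = {0};                  /* every element starts as '\0' */
    char *str = res + (sizeof res - 1);  /* points at the terminator, res[15] */
    int d;

    assert(n > 0 && n < 4000);

    /* Build the string backwards: units, tens, hundreds, thousands. */
    d = n % 10; if (d != 0) str = prepend(str, units[d - 1]);     n /= 10;
    d = n % 10; if (d != 0) str = prepend(str, tens[d - 1]);      n /= 10;
    d = n % 10; if (d != 0) str = prepend(str, hundreds[d - 1]);  n /= 10;
    d = n % 10; if (d != 0) str = prepend(str, thousands[d - 1]);

    strcpy(out, str);  /* at most 15 characters plus '\0' */
}
```

Because str now starts at res[15], the longest result (15 characters) ends up starting exactly at res[0], and no uninitialized character is left between the digits and the terminator.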

Answer 2

... and how to fix it?

Consider writing units in 1,000s, then 100s, tens, ones order.
Add a "zero" to each unit list.
Roll into a helper function.
Change types from character arrays to arrays of pointers.

Example

#include <assert.h>
#include <stdio.h>
#include <string.h>

void print_roman(int n) {
  static const char *thousands[] = {"", "M", "MM", "MMM"};
  static const char *hundreds[] = {"", "C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM"};
  static const char *tens[] = {"", "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"};
  static const char *ones[] = {"", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"};
  static const char **units[] = {thousands, hundreds, tens, ones};
  int units_n = sizeof units / sizeof units[0];

  assert(n > 0 && n < 4000);
  char res[16];
  char *p = res;
  int multiplier = 1000;
  for (int i = 0; i < units_n; i++) {
    strcpy(p, units[i][n / multiplier]);
    p += strlen(p);
    n %= multiplier;
    multiplier /= 10;
  }
  printf("<%s>\n", res);
}

int main() {
  print_roman(3888);
  print_roman(3999);
  print_roman(1010);
  print_roman(42);
}

Output

<MMMDCCCLXXXVIII>
<MMMCMXCIX>
<MX>
<XLII>
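If the caller needs the string itself rather than printed output, the same table-driven loop can write into a caller-supplied buffer. This is only a sketch of one possible variant, not part of the answer above: the function name roman_to_buffer, its return convention (0 on success, -1 on failure) and the size check are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variant: writes the Roman numeral for n into out.
 * Returns 0 on success, -1 if n is outside 1..3999 or out is too small. */
static int roman_to_buffer(int n, char *out, size_t out_size) {
    static const char *const thousands[] = {"", "M", "MM", "MMM"};
    static const char *const hundreds[] = {"", "C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM"};
    static const char *const tens[] = {"", "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"};
    static const char *const ones[] = {"", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"};
    static const char *const *const units[] = {thousands, hundreds, tens, ones};
    size_t used = 0;
    int multiplier = 1000;

    assert(out != NULL);
    if (n <= 0 || n >= 4000 || out_size == 0) {
        return -1;
    }
    out[0] = '\0';

    for (size_t i = 0; i < sizeof units / sizeof units[0]; i++) {
        const char *digit = units[i][n / multiplier];
        size_t len = strlen(digit);
        if (used + len + 1 > out_size) {
            return -1;                       /* not enough room for digit and '\0' */
        }
        memcpy(out + used, digit, len + 1);  /* copies the terminator as well */
        used += len;
        n %= multiplier;
        multiplier /= 10;
    }
    return 0;
}
```

A buffer of 16 bytes is enough for every value in range, since the longest numeral, "MMMDCCCLXXXVIII", has 15 characters.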
