strcat adds junk to the string
I'm trying to reverse each word of a sentence without changing the order of the words.
For example: "Hello World" => "olleH dlroW"
Here is my code:
#include <stdio.h>
#include <string.h>
char * reverseWords(const char *text);
char * reverseWord(char *word);
int main () {
char *text = "Hello World";
char *result = reverseWords(text);
char *expected_result = "olleH dlroW";
printf("%s == %s\n", result, expected_result);
printf("%d\n", strcmp(result, expected_result));
return 0;
}
char *
reverseWords (const char *text) {
// This function takes a string and reverses it words.
int i, j;
size_t len = strlen(text);
size_t text_size = len * sizeof(char);
// output containst the output or the result
char *output;
// temp_word is a temporary variable,
// it contains each word and it will be
// empty after each space.
char *temp_word;
// temp_char is a temporary variable,
// it contains the current character
// within the for loop below.
char temp_char;
// allocating memory for output.
output = (char *) malloc (text_size + 1);
for(i = 0; i < len; i++) {
// if the text[i] is space, just append it
if (text[i] == ' ') {
output[i] = ' ';
}
// if the text[i] is NULL, just get out of the loop
if (text[i] == '\0') {
break;
}
// allocate memory for the temp_word
temp_word = (char *) malloc (text_size + 1);
// set j to 0, so we can iterate only on the word
j = 0;
// while text[i + j] is not space or NULL, continue the loop
while((text[i + j] != ' ') && (text[i + j] != '\0')) {
// assign and cast test[i+j] to temp_char as a character,
// (it reads it as string by default)
temp_char = (char) text[i+j];
// concat temp_char to the temp_word
strcat(temp_word, &temp_char); // <= PROBLEM
// add one to j
j++;
}
// after the loop, concat the reversed version
// of the word to the output
strcat(output, reverseWord(temp_word));
// if text[i+j] is space, concat space to the output
if (text[i+j] == ' ')
strcat(output, " ");
// free the memory allocated for the temp_word
free(temp_word);
// add j to i, so u can skip
// the character that already read.
i += j;
}
return output;
}
char *
reverseWord (char *word) {
int i, j;
size_t len = strlen(word);
char *output;
output = (char *) malloc (len + 1);
j = 0;
for(i = (len - 1); i >= 0; i--) {
output[j++] = word[i];
}
return output;
}
The problem is the line I marked with <= PROBLEM. On the first word, which in this case is "Hello", it does everything just fine.
On the second word, which in this case is "World", it adds junk characters to temp_word. I checked it with gdb: temp_char doesn't contain the junk, but when strcat runs, the latest character appended to temp_word is something like W\006.
It appends \006 to all of the characters within the second word.
The output that I see on the terminal is fine, but printing out strcmp and comparing result with expected_result returns -94.
What is the \006 character? Why does strcat add it?
The root cause of the junk characters is that you pass the wrong input as the second argument of strcat. See the explanation below:
At the beginning of your function you declare:
int i, j;
size_t len = strlen(text);
size_t text_size = len * sizeof(char);
// output containst the output or the result
char *output;
// temp_word is a temporary variable,
// it contains each word and it will be
// empty after each space.
char *temp_word;
// temp_char is a temporary variable,
// it contains the current character
// within the for loop below.
char temp_char;
You can print the addresses of these variables on the stack; they will be something like this:
printf("&temp_char=%p,&temp_word=%p,&output=%p,&text_size=%p\n", &temp_char, &temp_word,&output,&text_size);
result:
&temp_char=0x7ffeea172a9f,&temp_word=0x7ffeea172aa0,&output=0x7ffeea172aa8,&text_size=0x7ffeea172ab0
As you can see, temp_char (0x7ffeea172a9f) sits at the lowest address of these variables; temp_word (0x7ffeea172aa0) starts 1 byte after it, output (0x7ffeea172aa8) 8 bytes after that, and so on (I used a 64-bit OS, so a pointer takes 8 bytes).
// concat temp_char to the temp_word
strcat(temp_word, &temp_char); // <= PROBLEM
Refer to the strcat description here: http://www.cplusplus.com/reference/cstring/strcat/
The second argument of strcat is &temp_char = 0x7ffeea172a9f. strcat treats that address as the start of the source string: instead of adding only one char as you expect, it appends to temp_word every byte starting from &temp_char until it meets a terminating null character. In the layout above, those bytes are whatever follows temp_char in memory, starting with the bytes of the pointer temp_word itself; on another compiler or build the neighbours, and so the junk, can differ.
strcat() expects the addresses of the first characters of "C" strings, which in fact are char arrays with at least one element equal to '\0'.
Neither the memory temp_word points to nor the memory &temp_char points to meets this requirement.
Because of this, the infamous undefined behaviour is invoked; anything can happen from then on.
A possible fix would be to change
temp_word = (char *) malloc (text_size + 1);
to become
temp_word = malloc (text_size + 1); /* Not the issue but the cast is
just useless in C. */
temp_word[0] = '\0';
and this
strcat(temp_word, &temp_char);
to become
strcat(temp_word, (char[2]){temp_char});
There might be other issues with the rest of the code.
The function strcat deals with strings.
In this code snippet
// assign and cast test[i+j] to temp_char as a character,
// (it reads it as string by default)
temp_char = (char) text[i+j];
// concat temp_char to the temp_word
strcat(temp_word, &temp_char); // <= PROBLEM
neither the pointer temp_word nor the pointer &temp_char points to a string.
Moreover, the array output is not terminated with the zero character, for example when the source string consists of blanks.
In any case your approach is too complicated and has a lot of redundant code, for example the condition in the for loop and the condition in the if statement, which duplicate each other.
for(i = 0; i < len; i++) {
//…
// if the text[i] is NULL, just get out of the loop
if (text[i] == '\0') {
break;
}
The function can be written more simply, as shown in the demonstrative program below.
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
char * reverse_words( const char *s )
{
char *result = malloc( strlen( s ) + 1 );
if ( result != NULL )
{
char *p = result;
while ( *s != '\0' )
{
while ( isblank( ( unsigned char )*s ) )
{
*p++ = *s++;
}
const char *q = s;
while ( !isblank( ( unsigned char )*q ) && *q != '\0' ) ++q;
for ( const char *tmp = q; tmp != s; )
{
*p++ = *--tmp;
}
s = q;
}
*p = '\0';
}
return result;
}
int main(void)
{
const char *s = "Hello World";
char *result = reverse_words( s );
puts( s );
puts( result );
free( result );
return 0;
}
The program output is
Hello World
olleH dlroW