Seg fault on a program splitting a char array into a 2D array
I wrote a program to split a char array into a 2D array, as the comments below the function definition state. However, I am getting a segmentation fault in this piece of code. Can someone help me find out why?
The my_strlen(str) function works the same way as the standard strlen(str) function, and it works perfectly. The length of the char array is limited, so I am not really worried about the efficiency of the memory allocation.
char **my_str2vect(char *str) {
// Takes a string
// Allocates a new vector (array of string ended by a NULL),
// Splits apart the input string x at each space character
// Returns the newly allocated array of strings
// Any number of ' ','\t', and '\n's can separate words.
// I.e. "hello \t\t\n class,\nhow are you?" -> {"hello", "class,", "how", "are","you?", NULL}
int str_len = my_strlen(str);
char **output = malloc(sizeof(char *) * str_len); // Allocate a 2D array first.
if (**output) {
for (int a = 0; a < str_len; ++a) {
output[a] = malloc(sizeof(char) * str_len);
}
} else {
return NULL;
}
int i = 0;
int j = 0;
while (i < str_len) { // Put the characters into the 2D array.
int k = 0;
while ((str[i] == ' ' || str[i] == '\t' || str[i] == '\n') && i < str_len) {
++i;
}
while ((!(str[i] == ' ' || str[i] == '\t' || str[i] == '\n')) && i < str_len) {
output[j][k] = str[i];
++i;
++k;
}
output[j][k] = '\0';
++j;
}
output[j] = NULL;
return output;
}
To correct your code, change if (**output) to if (output).
I think your implementation is not memory efficient and could be more elegant.
You are allocating too much memory. I tried to explain in the code the upper bound for the number of output char pointers. If you want the exact size, you'll have to count the words in the string. It's probably better to do it that way, but for the exercise I think we can go the easier way.
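For the exact-size variant mentioned above, the words can be counted in one pass before allocating. This is only a sketch: the names is_delim and count_words are hypothetical, and the delimiter set (space, tab, newline) is taken from the comments in the question.

```c
#include <stddef.h>

/* Returns nonzero for the three delimiters named in the question. */
static int is_delim(char c) {
    return c == ' ' || c == '\t' || c == '\n';
}

/* Hypothetical helper: counts maximal runs of non-delimiter characters. */
size_t count_words(const char *s) {
    size_t n = 0;
    int in_word = 0;
    for (; *s != '\0'; ++s) {
        if (is_delim(*s)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;   /* first character of a new word */
            ++n;
        }
    }
    return n;
}
```

The output vector then needs count_words(s) + 1 pointers, the extra one being the terminating NULL.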
As to your code I can only say:
'\0' anywhere, which is a bad sign
Please see below an improved implementation (I use standard C89):
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char** my_str2vect(char* s) {
// Takes a string
// Allocates a new vector (array of string ended by a NULL),
// Splits apart the input string x at each space character
// Returns the newly allocated array of strings
// Any number of ' ','\t', and '\n's can separate words.
// I.e. "hello \t\t\n class,\nhow are you?" -> {"hello", "class,", "how", "are","you?", NULL}
int s_size = strlen(s);
/*
* number of words is 1 if string contains non delimiters only
* number of words is 0 if string contains delimiters only
* number of words is (strlen + 1) / 2 if string contains ...
* ...alternation of non delimiter and delimiter (e.g. "a b c"), and that is the max
* so we allocate that size (upper bound) plus one slot for the terminating NULL
*/
int max_output_size = (s_size + 1) / 2 + 1;
char **output = (char **) malloc(sizeof (char *) * max_output_size);
//initialize to NULL for convenience
int k;
for (k = 0; k < max_output_size; k++)
output[k] = NULL;
//work on a copy of s
char *str = (char *) malloc(s_size + 1);
strcpy(str, s);
//pointer for token and delimiters
char *ptr;
char delimiter[] = "\n\t ";
//initialize and create first token
ptr = strtok(str, delimiter);
//
int i = 0;
while (ptr != NULL) {
//allocate memory and copy token
output[i] = malloc(sizeof (char) * strlen(ptr) + 1);
strcpy(output[i], ptr);
//get next token
ptr = strtok(NULL, delimiter);
//increment
i++;
}
//the tokens were copied, so the working copy can be released
free(str);
return output;
}
int main(int argc, char *argv[]) {
char **result = my_str2vect("hello \t\t\n class,\nhow are you?");
int i;
for (i = 0; result[i] != NULL; i++)
printf("%s\n", result[i]);
return 0;
}
I used gdb to track down the problem. It comes from the **output check. You should test the address held in output itself instead of what the pointer to pointer ultimately points to. You are also allocating a separate block in the for loop for every index up to the length of the string, which may cause fragmentation. Moreover, the 1D char array should be passed as const so that it cannot be modified. Instead, you should use the snippet
// allocation (in the function)
// prototype: char** my_str2vect(char const* str)
int a;
char** output = malloc(str_len * sizeof(char *));
output[0] = malloc(str_len * str_len * sizeof(char));
for(a = 1; a < str_len; a++)
output[a] = output[0] + a * str_len;
// freeing (in main())
char ** x;
char const* str = "hello \t\t\n class,\nhow are you?";
x = my_str2vect(str);
free((void *)x[0]);
free((void *)x);
In passing, the source helps to learn more about allocation.
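The contiguous layout from the snippet can be wrapped in a pair of helpers so that allocation and release stay symmetric. This is a sketch under the same n-by-n sizing as the snippet; alloc_grid and free_grid are hypothetical names, and the NULL checks are an addition that the snippet leaves out.

```c
#include <stdlib.h>

/* Hypothetical helper: n rows of n bytes, backed by one contiguous block. */
char **alloc_grid(size_t n) {
    if (n == 0) {
        return NULL;
    }
    char **rows = malloc(n * sizeof *rows);
    if (rows == NULL) {
        return NULL;
    }
    rows[0] = malloc(n * n);            /* one block for all characters */
    if (rows[0] == NULL) {
        free(rows);
        return NULL;
    }
    for (size_t a = 1; a < n; ++a) {
        rows[a] = rows[0] + a * n;      /* row a starts a * n bytes in */
    }
    return rows;
}

/* Two calls to free are enough, whatever the number of rows. */
void free_grid(char **rows) {
    if (rows == NULL) {
        return;
    }
    free(rows[0]);
    free(rows);
}
```

Compared with one malloc per row, this makes only two allocations in total, which is the point of the remark about fragmentation.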
As the debugger is telling you, the if (**output) is broken. It's trying to dereference the pointer in the first output array location. That's junk at the point of the if. Hence the seg fault. You want if (output). When I fix this and use strlen in place of your rewrite, it seems to work fine.
It's considerably simpler to make one copy of the input string and use this for all the strings in the returned vector. You can also use strtok to find the words, but that's not thread safe.
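On the thread-safety point: strtok keeps its position in hidden static state. One way around that, sketched here with the standard strspn and strcspn functions, is to keep the cursor in the caller. The name next_word is hypothetical, and this is an alternative to strtok rather than what the code in this answer uses.

```c
#include <string.h>

/* Hypothetical helper: finds the next word at or after *cursor.
 * Stores its length in *len, advances *cursor past it, and returns its
 * start, or NULL when no words remain. The input is never modified. */
const char *next_word(const char **cursor, size_t *len) {
    static const char delims[] = " \t\n";
    /* Skip any run of delimiters in front of the word. */
    const char *start = *cursor + strspn(*cursor, delims);
    if (*start == '\0') {
        *cursor = start;
        return NULL;
    }
    *len = strcspn(start, delims);   /* characters up to the next delimiter */
    *cursor = start + *len;
    return start;
}
```

Because all state lives in the caller's cursor, several threads can scan different strings, or even the same string, at the same time.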
Here's a suggestion:
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>
char **split(char *s_org) {
size_t i;
// Skip initial whitespace, then copy everything else.
// The cast avoids undefined behavior in isspace for negative char values.
for (i = 0; s_org[i] && isspace((unsigned char)s_org[i]); ++i) /* skip */;
char *s = strdup(s_org + i);
size_t n_rtn = 0, size = 0;
char **rtn = malloc(sizeof *rtn);
for (i = 0;;) {
if (!s[i]) {
// Make room for the terminator first: rtn may already be full here.
rtn = realloc(rtn, (n_rtn + 1) * sizeof *rtn);
rtn[n_rtn] = NULL;
return rtn;
}
if (n_rtn == size) {
size = 2 * size + 1;
rtn = realloc(rtn, size * sizeof *rtn);
}
rtn[n_rtn++] = s + i;
while (s[i] && !isspace((unsigned char)s[i])) ++i;
if (s[i]) {
s[i++] = '\0';
while (isspace((unsigned char)s[i])) ++i;
}
}
}
int main(void) {
char **rtn = split(" hello \t\t\n class,\nhow are you?");
for (char **p = rtn; *p; ++p)
printf("%s\n", *p);
// Freeing the first element frees all strings (or does nothing if none)
free(rtn[0]);
free(rtn);
return 0;
}
This omits checks for NULL returns from malloc and realloc. But they're easy to add.
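As an illustration of what such a check could look like for the growing vector, here is a minimal sketch. The helper name grow_vector is hypothetical; the important detail is that realloc's result goes into a temporary, so the old block is not lost when realloc fails.

```c
#include <stdlib.h>

/* Hypothetical helper: resizes *vec to hold new_count pointers.
 * Returns 0 on success; on failure returns -1 and leaves *vec valid. */
int grow_vector(char ***vec, size_t new_count) {
    char **tmp = realloc(*vec, new_count * sizeof **vec);
    if (tmp == NULL) {
        return -1;   /* the old block is still owned by the caller */
    }
    *vec = tmp;
    return 0;
}
```

A call such as `if (grow_vector(&rtn, size) != 0) { /* free and bail out */ }` would then replace each bare `rtn = realloc(rtn, ...)`.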
You asked about the "other problems" with your code. I've fixed some here:
- Use size_t to index arrays.
- ... malloc.
- Avoid strlen when simple checks for the terminating null character are easier.
- Use the idiom FOO *p = malloc(sizeof *p); to allocate a FOO. It's less error prone than sizeof(FOO).