Splitting string into words in array without using any pre-made functions in C
I am trying to create a function that takes a string, splits it into words, and returns an array with the words in it. I am not allowed to use any pre-made functions other than malloc within the splitting function. Finally, my function has to have this form:
char **ft_split_whitespaces(char *str)
My current output looks like this:
d this is me
s is me
s me
r
Expected output:
Hello
World
This
Is
Me
My full code is below:
#include <stdio.h>
#include <stdlib.h>
int count_words(char *str)
{
int i;
int word;
i = 0;
word = 1;
while(str[i]!='\0')
{
if(str[i]==' ' || str[i]=='\n' || str[i]=='\t'
|| str[i]=='\f' || str[i]=='\r' || str[i]=='\v')
word++;
i++;
}
return (word);
}
char **ft_split_whitespaces(char *str)
{
int index;
int size;
int index2;
char **arr;
index = 0;
index2 = 0;
size = count_words(str);
arr = (char **)malloc(size * sizeof(char));
if (arr == NULL)
return ((char **)NULL);
while (str[index])
{
if(str[index] == ' ')
{
index++;
value++;
index2++;
}
else
*(arr+index2) = (char*) malloc(index * sizeof(char));
*(arr+index2) = &str[index];
index++;
}
**arr = '\0';
return (arr);
}
int main()
{
char a[] = "Hello World This Is Me";
char **arr;
int i;
int ctr = count_words(a);
arr = ft_split_whitespaces(a);
for(i=0;i < ctr;i++)
printf("%s\n",arr[i]);
return 0;
}
You have quite a few errors in your program:

1. arr = (char **)malloc(size * sizeof(char));
   is not right, since arr is of type char**. You should use sizeof(char*) or, better, sizeof(*arr): sizeof(char) is always 1, whereas sizeof(char*) is typically 4 or 8 on modern systems.
2. You don't have braces {} around your else statement in ft_split_whitespaces, which you probably intended. As written, only the first statement after else belongs to it, so your conditional logic breaks.
3. You are allocating a new char[] for every non-whitespace character in the while loop. You should allocate only once per new word and then fill in the characters of that array.
4. *(arr+index2) = &str[index];
   This doesn't do what you think it does. It just points the string at *(arr+index2) to str offset by index. You need to either copy each character individually or use memcpy() (which you probably can't use here). This explains why your output is just offsets into the main string rather than the actual tokens.
5. **arr = '\0';
   You will lose whatever you store at index 0 of arr. You need to append a '\0' to each string in arr individually.
6. *(arr+index2) = (char*) malloc(index * sizeof(char));
   You end up allocating progressively larger char arrays because you use index as the character count, and index keeps increasing. You need to work out the correct length of each token in the string and allocate accordingly.
7. Also, why *(arr + index2)? Why not use the much easier to read arr[index2]?
Further clarifications:

Consider str = "abc de". You'll start with

*(arr + 0) = (char*) malloc(0 * sizeof(char));
// a pointer from malloc(0) must not be dereferenced and is mostly pointless (no pun intended); it may be NULL
*(arr + 0) = &str[0];

Here str[0] = 'a' lives somewhere in memory, so &str[0] yields that address, and you store it in *(arr + 0).

Now, in the next iteration, you'll have

*(arr + 0) = (char*) malloc(1 * sizeof(char));
*(arr + 0) = &str[1];

This time you overwrite the pointer to the freshly malloc'd array at the same index2 with a different address (and leak the block malloc just returned). In the next iteration:

*(arr + 0) = (char*) malloc(2 * sizeof(char));

You keep resetting the same *(arr + index2) position until you encounter whitespace, after which you do the same thing again for the next word. So don't allocate an array for every value of index, only when one is actually required. This also shows that the size passed to malloc keeps growing with index, which is what #6 indicated.
Coming to &str[index]: you are setting *(arr + index2), i.e. a char* (pointer to char), to another char*. In C, assigning one pointer to another doesn't copy the data it points to; it only makes both of them point to the same memory location. So when you set something like *(arr + 1) = &str[4], it's just a pointer into the original string at index = 4. If you print *(arr + 1), you'll get the substring from index = 4 to the end of the string, not the word you're trying to obtain.
**arr = '\0' just dereferences the pointer at *arr and sets the character it points to to '\0'. So imagine you had *(arr + 0) = "hello\0": you'd turn it into "\0ello\0". Anything that iterates over this string stops at the first '\0' character and never goes beyond it. Hence you lose whatever *arr was pointing to earlier.
Also, *(arr + i) and arr[i] are exactly equivalent, and the latter is much more readable. It better conveys that arr is an array and that arr[i] accesses its i-th element.
Here is how I would do it:
#include <stdio.h> // printf
#include <stdlib.h> // malloc

// this returns an array of pointers to strings
// that is one longer than the number of strings
// the last item in the array is always NULL
// so that the caller can tell when they get to the end
// we have to do this because we have no way
// to return the size of the finished array
char **ft_split_whitespaces(char *str)
{
    /*
     * First count the number of pieces in the string
     */
    // there will always be a NULL at the end of the array
    int size = 1;
    // if the string isn't empty there is one piece after the last space
    if (*str != '\0')
        size++;
    // there will be one piece for the bit before each space
    // so loop through the string looking for a space
    for (char *pointer = str; *pointer != '\0'; pointer++)
        if (*pointer == ' ')
            size++;

    /*
     * Now allocate the array of items that will be returned
     */
    char **array = malloc(size * sizeof(char*));
    if (array == NULL) return NULL; // ERROR: return "Something really bad happened!"

    /*
     * Then split the string into items and store them into the array
     */
    // index is where the piece will be stored
    int index = 0;
    // we need two pointers:
    // - one for where we currently are
    // - and one for what we are looking for
    char *current = str, *next = str;
    // we are done if we are at the end of the string
    while (*current != '\0')
    {
        // find a space character
        // but stop looking if we find the end of the string instead
        while (*next != '\0' && *next != ' ')
            next++;
        // now allocate enough space for this piece
        char *piece = malloc(next - current + 1);
        if (piece == NULL) break; // ERROR: exit the loop and return array as it is
        // and copy the piece into the memory
        for (int i = 0; i < next - current; i++)
            piece[i] = current[i];
        // then terminate the string
        piece[next - current] = '\0';
        // store the new piece and increase the index
        array[index++] = piece;
        // now we are done with that piece
        // so start looking for the next one:
        // step over the space, but never past the terminating '\0'
        if (*next != '\0')
            next++;
        current = next;
    }
    // make sure the array ends with a NULL;
    array[index] = NULL;
    // return the new array
    return array;
}

int main()
{
    char **items = ft_split_whitespaces("Hello World This Is Me");
    if (items == NULL)
        printf("Something really bad happened!");
    else // loop through the array until we find a NULL
        for (char **pointer = items; *pointer != NULL; pointer++)
            printf("%s\n", *pointer);
    return 0;
}
Try it at https://onlinegdb.com/MhXIUQdo0