
How to allocate memory in C for each character at a time for double pointers char**

My question is how to allocate memory for each character of a word, one at a time, when we use char**. I am trying to do it like this:

 words = malloc(sizeof(char*) * numberOfWords);

and when I am ready to copy characters:

int i = 0;
while(some condition){
 (*words)[i] = malloc(1); //here I am getting error "Incompatible pointer to integer conversion assigning to char from void*"
 arrayOfWords++;
 i++;
}

So what I want to do is allocate space for each character one at a time, rather than reserve some fixed maximum number of bytes. The lengths of the words may vary considerably.

Thanks in advance!

Problems I see:

The line:

*words = malloc(sizeof(char*) * numberOfWords);

should be:

words = malloc(sizeof(char*) * numberOfWords);

The line:

(*words)[i] = malloc(1);

should be:

words[i] = malloc(1);

The type of words[i] is char*. The type of (*words)[i] is char. The compiler is complaining about the assignment of a pointer (the return value of malloc) to a char.
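To make the three levels of indirection concrete, here is a minimal sketch. The helper name `make_word_slots` and its parameters are made up for illustration and do not come from the question; the comments mark the type of each expression.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates `count` word slots, each with room for
   `initial_len` characters plus the terminating NUL. */
char **make_word_slots(size_t count, size_t initial_len)
{
    char **words = malloc(count * sizeof *words);   /* words       : char** */
    if (words == NULL)
        return NULL;
    for (size_t i = 0; i < count; ++i) {
        words[i] = malloc(initial_len + 1);         /* words[i]    : char*  */
        if (words[i] == NULL) {
            /* Undo the allocations made so far before reporting failure. */
            while (i > 0)
                free(words[--i]);
            free(words);
            return NULL;
        }
        words[i][0] = '\0';                         /* words[i][0] : char   */
    }
    return words;
}
```

The caller is expected to free each words[i] and then words itself.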

It sounds like you want a dynamic string that automatically increases its length each time you add a character to it. Doing this in C is fairly complex and requires understanding dynamic memory allocation and data structures.

One strategy is to first allocate a fixed block of memory and gradually add data (in this case characters) to it. Once you fill up the allocated memory, you need to allocate a new, larger block of memory (often twice the size of the original block) and copy the data over to the new block of memory.

Note that this strategy does not increase the size of the allocated memory by a single byte at a time. Doing so would be incredibly inefficient.

This is more or less what standard data structures do in other languages and libraries (e.g. std::string and std::vector in C++ and ArrayList in Java).
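The doubling strategy described above could be sketched roughly as follows for appending a single character at a time. The type `char_buffer`, the function `char_buffer_push`, and the starting capacity of 16 are all assumptions chosen for illustration.

```c
#include <stdlib.h>

/* Hypothetical growable character buffer; the names are illustrative. */
typedef struct {
    char  *data;      /* NUL-terminated contents, or NULL before first use */
    size_t length;    /* characters stored, not counting the NUL */
    size_t capacity;  /* bytes currently allocated */
} char_buffer;

/* Appends one character; returns 0 on success, -1 if out of memory. */
int char_buffer_push(char_buffer *buf, char c)
{
    if (buf->length + 2 > buf->capacity) {      /* need room for c and NUL */
        size_t new_capacity = buf->capacity ? buf->capacity * 2 : 16;
        char *tmp = realloc(buf->data, new_capacity);
        if (tmp == NULL)
            return -1;                          /* old buffer is still valid */
        buf->data = tmp;
        buf->capacity = new_capacity;
    }
    buf->data[buf->length++] = c;
    buf->data[buf->length] = '\0';
    return 0;
}
```

Start from a zero-initialized buffer (char_buffer buf = {0};), call char_buffer_push for each character, and free(buf.data) when done. Because realloc(NULL, n) behaves like malloc(n), no separate creation function is needed.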

You can make a forever-growing string pretty easily...

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define STRING_CHUNK_SIZE 100
    typedef struct 
    {
        char* s;
        unsigned int size;
        unsigned int allocated_size;
    } string;

    void string_create(string* s)
    {
        s->s = malloc(STRING_CHUNK_SIZE);
        s->s[0] = 0;
        s->size = 0;    
        s->allocated_size = STRING_CHUNK_SIZE;
    }

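    void string_add(string* s, const char* str)
    {
        size_t len = strlen(str);
        size_t needed = s->size + len + 1; /* +1 for the terminating NUL */
        if(needed > s->allocated_size)
        {
            /* Round the new capacity up to a whole number of chunks. */
            size_t chunks = (needed + STRING_CHUNK_SIZE - 1) / STRING_CHUNK_SIZE;
            char* tmp = realloc(s->s, chunks * STRING_CHUNK_SIZE);
            if(tmp == NULL) return; /* out of memory: leave the string unchanged */
            s->s = tmp;
            s->allocated_size = chunks * STRING_CHUNK_SIZE;
        }
        memcpy(s->s + s->size, str, len + 1); /* copies the NUL as well */
        s->size += len;
    }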

    char* string_p(string* s)
    {
        return s->s;
    }

void string_destroy(string* s)
{
    free(s->s);
}

you can then use something like

int i;
string s;
string_create(&s);
string_add(&s, "blah");
printf("%s\r\n", string_p(&s));
for(i = 0; i<100; i++)
{
    string_add(&s, "blah");
}
printf("%s\r\n", string_p(&s));
string_destroy(&s);

If you REALLY REALLY want allocation of 1 byte at a time, then change the #define chunk size to 1

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char **copy_words(char *src_words[], int numberOfWords){
    char **words = malloc(sizeof(char*) * numberOfWords);
    int i = 0;
    while(i < numberOfWords){
        words[i] = malloc(strlen(src_words[i]) + 1);
        strcpy(words[i], src_words[i]);
        i++;
    }
    return words;
}

int main(void){
    char *words[] = { "first", "second", "..end"};
    int numOfWords = sizeof(words)/sizeof(*words);
    char **clone = copy_words(words, numOfWords);
    int i;
    for(i = 0; i < numOfWords; ++i){
        printf("%s\n", clone[i]);
        free(clone[i]);
    }
    free(clone);
    return 0;
}

This has become a little longer than expected, but here is then the promised example for how to allocate a list of words. To keep the example simple, let's assume we want to split a string at white space. Given a NUL terminated character string, we want to create a NULL terminated array of NUL terminated character strings, representing the individual words. No memory should be wasted.

To give a complete working example (just concatenate the following code blocks), here are the headers we will need:

#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

We are going to allocate some memory, so we'd better think about freeing it correctly. Let's start with that:

void
free_words(char * * words)
{
  /* Treat NULL gracefully for consistency with the standard library's free(). */
  if (words != NULL)
    {
      size_t i;
      for (i = 0; words[i] != NULL; ++i)
        free(words[i]);
      free(words);
    }
}

Note that we require that the array of words be NULL terminated. Otherwise, we'd have no chance to tell where it ends.

Now to the tokenizer. We could loop over the string twice and count how many words it has in advance. But in general, this might not be possible, for example, if the string is actually a continuous input stream. I also wanted to show how an array can grow dynamically. If growing an array, we should always increase its size by a multiplicative factor (usually 2) in order to maintain linear amortized asymptotic complexity. (If you don't know what “linear amortized asymptotic complexity” means, just take the recommended procedure as a best practice.)

char * *
tokenize(const char *const sentence)
{
  size_t capacity = 1;
  size_t word_count = 0;
  ptrdiff_t word_start = -1;  /* ptrdiff_t is standard C; ssize_t is POSIX-only */
  size_t i = 0;
  char * * words = malloc(capacity * sizeof(char *));
  char * * temp;
  if (words == NULL)
    goto fail;
  words[word_count] = NULL;
  do
    {
      if (isspace((unsigned char) sentence[i]) || sentence[i] == '\0')
        {
          if (word_start >= 0)
            {
              /* We have found the end of the current word. */
              const size_t word_length = i - word_start;
              char * word;
              if (word_count + 1 >= capacity)
                {
                  /* We need to grow the array. */
                  capacity *= 2;
                  temp = realloc(words, capacity * sizeof(char *));
                  if (temp == NULL)
                    goto fail;
                  words = temp;
                }
              word = malloc((word_length + 1) * sizeof(char));
              if (word == NULL)
                goto fail;
              strncpy(word, sentence + word_start, word_length);
              word[word_length] = '\0';
              words[word_count++] = word;
              words[word_count] = NULL;
              word_start = -1;
            }
        }
      else
        {
          if (word_start < 0)
            {
              /* We have found the beginning of a new word. */
              word_start = i;
            }
        }
    }
  while (sentence[i++]);
  /* Trim the array to the exact size needed. */
  temp = realloc(words, (word_count + 1) * sizeof(char *));
  if (temp == NULL)
    goto fail;
  words = temp;
  return words;
 fail:
  free_words(words);
  return NULL;
}

The logic to actually find the word boundaries is rather simple but I won't explain it here, since that has little to nothing to do with the question, which is about memory management.

Whenever we find the end of a new word, we check whether the array is still large enough to hold it and grow it if needed. After making sure the array is big enough, we allocate just enough memory to hold the next word, copy over the data and insert it into the array. We also need to take care to terminate the character string with a NUL byte and the array of words with a NULL pointer.

We keep track of the array's current capacity in the variable capacity and the number of words inserted so far in the variable word_count. When we decide to grow the array, we use the realloc function, which tries to adjust the amount of space reserved for that pointer and, if that is not possible, allocates new space, copies over the data, and frees the old block.

Before we finally return the array, we trim its size to what is actually needed. Whether doing so is useful might be open to debate. I just wanted to show that you can do it.
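As a side note on trimming: when realloc fails, the original block remains valid, so a failed shrink does not have to be treated as an error. A minimal sketch of that alternative, with the hypothetical helper name `trim_words`:

```c
#include <stdlib.h>

/* Hypothetical helper: shrinks a pointer array to `count` + 1 slots (the
   extra slot holds the terminating NULL). If realloc fails, the original
   block is still valid, so it is returned unchanged. */
char **trim_words(char **words, size_t count)
{
    char **temp = realloc(words, (count + 1) * sizeof *words);
    return temp != NULL ? temp : words;
}
```

This trades a slightly oversized array for never failing at the very last step.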

At any point where we allocate memory, we should be prepared to handle an out-of-memory condition. The standard library's functions report this by returning NULL. If we run out of memory, we fail the operation by also returning NULL. However, we must not do so without first releasing any memory allocated so far. You might be offended by my use of gotos for this purpose, but this is a rather common and generally accepted use of this language feature. (Used like this, it merely mimics the functionality exceptions would give us, if only C had them.)

See the man pages of malloc, realloc and free for their precise semantics. This is serious advice: I have seen too much code that misuses them, especially in corner cases.

To finish the example, here is how our function may be used:

int
main()
{
  const char sentence[] = "The quick brown fox jumps over the sleazy dog.";
  char * * words = tokenize(sentence);
  size_t i;
  if (words == NULL)
    return EXIT_FAILURE;
  for (i = 0; words[i] != NULL; ++i)
    {
      printf("words[%zu] = '%s'\n", i, words[i]);
    }
  free_words(words);
  return EXIT_SUCCESS;
}

One final remark: In this example, it would have been even more efficient to store all words in the same array, separated by NUL bytes, and use a second array with pointers to the beginning of each word. This would have used only two heap-allocated arrays and placed the data more closely together in memory, which makes access more efficient. As an exercise, you can try to modify the above example accordingly. (Hint: You'll need realloc a lot.) In general, try to keep the number of heap-allocated pointers as small as possible, for both performance and maintainability reasons. Definitely don't allocate single bytes.
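For the simpler case where the whole sentence is already in memory, the two-array layout might look like the sketch below. This is only one possible reading of the exercise: the `word_list` type and the `split_in_place` function are hypothetical names, and because the input length is known up front, the sketch counts words in a first pass instead of growing anything with realloc. A streaming version would still need the growth logic shown earlier.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: exactly two heap allocations. */
typedef struct {
    char  *storage;  /* copy of the sentence, whitespace replaced by NUL */
    char **words;    /* NULL-terminated pointers into storage */
} word_list;

/* Returns 0 on success, -1 if out of memory. */
int split_in_place(const char *sentence, word_list *out)
{
    size_t length = strlen(sentence);
    size_t count = 0, i;
    int in_word = 0;
    char **words;
    char *storage = malloc(length + 1);
    if (storage == NULL)
        return -1;
    memcpy(storage, sentence, length + 1);

    /* First pass: count the words so the pointer array is sized exactly. */
    for (i = 0; i < length; ++i) {
        int space = isspace((unsigned char) storage[i]);
        if (!space && !in_word)
            ++count;
        in_word = !space;
    }
    words = malloc((count + 1) * sizeof *words);
    if (words == NULL) {
        free(storage);
        return -1;
    }

    /* Second pass: record word starts and cut the buffer at whitespace. */
    count = 0;
    in_word = 0;
    for (i = 0; i < length; ++i) {
        if (isspace((unsigned char) storage[i])) {
            storage[i] = '\0';
            in_word = 0;
        } else if (!in_word) {
            words[count++] = storage + i;
            in_word = 1;
        }
    }
    words[count] = NULL;
    out->storage = storage;
    out->words = words;
    return 0;
}
```

Cleanup is two calls, free(list.words) and free(list.storage), regardless of how many words there are.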
