
Array of pointers to strings: are the strings stored sequentially in memory?

I am wondering how strings are stored in memory when they are defined through an array of pointers that point to them.

For example:

char *pa[] = { "Hello World!", "foo","bar","huhu","Let´s talk about that" };

Are the strings (or rather, their characters) stored sequentially in memory, one after another?

For example, in this case:

The first character of the second string "foo", which is f, is stored in the byte directly after the \0 null character of the first string "Hello World!".

OR

Are the strings stored separately in memory? For example:

The \0 null character of the first string "Hello World!", then a sequence of unrelated bytes, then the f character of the second string "foo"?

OR

Does the layout depend on the situation, compiler, platform, etc., so that the strings are sometimes directly sequential and sometimes not?

Can it also happen that, for example, the f of the second string "foo" is stored directly after the \0 of the first string "Hello World!" (so those two are sequential), while between the \0 of "foo" and the b of the third string "bar" there is a gap of bytes that do not belong to any of the strings, depending on the compiler, platform, etc.?

The question is for both C and C++, as I work with both. If the answers differ between the two, please mention which language is meant.

I hope you can understand what I mean. Thank you very much for any answer.

No, you cannot assume anything. Neither C nor C++ specifies where string literals are placed relative to each other, so whether they end up in contiguous memory is entirely up to the implementation.

If you really want the strings to be like that, try

const char *base = "hello\0foo\0bar";
const char *hello = base;
const char *foo = base + 6; // hello + strlen(hello) + 1
const char *bar = base + 10; // foo + strlen(foo) + 1

or, as @SteveSummit suggests

const char *pa[] = { base, base + 6, base + 10 };
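The hard-coded offsets 6 and 10 are easy to get wrong when one of the strings changes. As a sketch, they can be derived with strlen instead. The helper name split_packed is made up for this illustration (it is not from the answers), and it assumes the caller knows how many strings the block contains:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill out[0..count-1] with pointers to `count` consecutive
   NUL-terminated strings packed back to back inside `packed`.
   Assumption: `packed` really contains at least `count` terminators. */
void split_packed(const char *packed, size_t count, const char **out)
{
    assert(packed != NULL && out != NULL);

    const char *p = packed;
    for (size_t i = 0; i < count; ++i) {
        out[i] = p;
        p += strlen(p) + 1;   /* step over the string and its '\0' */
    }
}
```

With base defined as above, split_packed(base, 3, pa) fills a three-element array pa with base, base + 6 and base + 10.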

Furthermore, if you had

char *pa[] = { "testing", "testing", "more testing" };

it would be possible for the compiler to store just one copy of the string "testing", and point to it from both pa[0] and pa[1]. (In fact, I just tried it with two modern compilers, and both of them did exactly that.)

Theoretically it would be possible for a really clever compiler to store just the string "more testing" and have pa[0] and pa[1] point into the middle of it.
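One way to see what a particular compiler did with such literals is to test the pointers for equality. This is a minimal sketch in C; the helper name points_into is made up for this illustration. It only uses ==, which is allowed between pointers to unrelated objects. For string literals the answer is whatever the compiler chose, so it may change between compilers or optimisation levels:

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if `inner` points at one of the bytes of the string `outer`
   (including its terminator), 0 otherwise. Only equality comparisons
   are used, so the check is valid even when the two pointers belong
   to different objects. */
int points_into(const char *outer, const char *inner)
{
    assert(outer != NULL);

    for (const char *p = outer; ; ++p) {
        if (p == inner)
            return 1;
        if (*p == '\0')
            return 0;
    }
}
```

For the array above, pa[0] == pa[1] tells you whether the two "testing" literals were merged, and points_into(pa[2], pa[0]) tells you whether "testing" was folded into the tail of "more testing".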

I assume you were asking out of curiosity, but if by any chance you were thinking of writing code that somehow depended on the ordering of string constants in memory, the immediate and simple answer is: Don't.

What Steve Summit answered, plus: If multiple strings are stored, they could be in any order, or far apart from each other.

In addition, comparing pointers to these strings using ">", ">=" etc. is undefined behaviour in C (in C++ the result is unspecified). So, given p1 = "testing" and p2 = "testing", you may check whether p2 == p1 + 8 (which will produce 0 or 1, with no guarantee as to which), but not whether p2 >= p1 + 8.
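As a rough sketch of that distinction in C (both function names are made up for this illustration): the equality test is always permitted, while an ordering test on unrelated pointers has to go through an integer conversion to stay out of undefined behaviour. uintptr_t is an optional type, and the ordering of the converted values is implementation-defined, so treat the second function as something that works on common flat-memory platforms rather than as a portable guarantee:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Does p2 start in the byte right after an object of p1_size bytes at p1?
   p1 + p1_size is a one-past-the-end pointer, which may be formed and
   compared for equality, but not dereferenced. */
int is_directly_after(const char *p1, size_t p1_size, const char *p2)
{
    assert(p1 != NULL);
    return p2 == p1 + p1_size;
}

/* Ordering pointers to unrelated objects with >= is undefined in C.
   Comparing the converted integers is implementation-defined instead. */
int address_at_or_after(const void *a, const void *b)
{
    return (uintptr_t)a >= (uintptr_t)b;
}
```

For the example above, is_directly_after(p1, 8, p2) is the p2 == p1 + 8 check, with 8 being the seven characters of "testing" plus its terminator.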

As others mentioned, the memory layout is up to the implementation.

Extending pmg's approach, in C you could do it like this:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>

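/* Copies the strings of the NULL-terminated array ppa into one contiguous
   block and returns a NULL-terminated array of pointers into that block.
   Returns NULL and sets errno on failure.
   The caller releases it with free(result[0]) followed by free(result). */
char ** create_pointer_array_pointing_to_sequential_data(char ** ppa)
{
  char ** result = NULL;

  if (NULL == ppa)
  {
    errno = EINVAL;
  }
  else
  {
    size_t s = 0; /* total number of characters, terminators excluded */
    size_t l = 0; /* number of strings */

    while (NULL != ppa[l])
    {
      s += strlen(ppa[l]);
      ++l;
    }

    result = malloc((l + 1) * sizeof *result);
    if (NULL != result)
    {
      /* s characters plus one '\0' per string; nothing to allocate if l is 0 */
      char * data = (0 < l) ? malloc(s + l) : NULL;
      if (0 == l || NULL != data)
      {
        for (size_t i = 0; i < l; ++i)
        {
          result[i] = data;
          strcpy(result[i], ppa[i]);
          data += strlen(result[i]) + 1; /* next string starts right after the '\0' */
        }

        result[l] = NULL;
      }
      else
      {
        int errno_save = errno;
        free(result);
        errno = errno_save;
        result = NULL;
      }
    }
  }

  return result;
}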

Use it like:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

char ** create_pointer_array_pointing_to_sequential_data(char ** ppa);

int main(void)
{
  char ** pa = create_pointer_array_pointing_to_sequential_data(
    (char*[]){"Hello World!",
      "foo",
      "bar",
      "huhu",
      "Let's talk about that",
      NULL}
    );

  if (NULL == pa)
  {
    perror("create_pointer_array_pointing_to_sequential_data() failed");
    exit(EXIT_FAILURE);
  }

  for (size_t i = 0; NULL != pa[i]; ++i)
  {
    printf("pa[%zu] starts at %p and ends at %p: %s\n",
      i, (void*) pa[i], (void*)(pa[i] + strlen(pa[i])), pa[i]);
  }

  free(pa[0]); /* the single block holding all the strings */
  free(pa);    /* the pointer array itself */

  return EXIT_SUCCESS;
}

And get:

pa[0] starts at 0x6000003f0 and ends at 0x6000003fc: Hello World!
pa[1] starts at 0x6000003fd and ends at 0x600000400: foo
pa[2] starts at 0x600000401 and ends at 0x600000404: bar
pa[3] starts at 0x600000405 and ends at 0x600000409: huhu
pa[4] starts at 0x60000040a and ends at 0x600000420: Let's talk about that
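In that output each string starts one byte after the previous string's terminator. Rather than reading addresses by eye, the same property can be checked in code. This is a sketch only, and is_packed is a name made up for this illustration; it uses nothing but equality comparisons, so it is also safe to call on an array whose strings live in unrelated objects:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if every string in the NULL-terminated array pa starts
   directly after the '\0' of the previous one, 0 otherwise.
   An empty array or a single string counts as packed. */
int is_packed(char *const *pa)
{
    assert(pa != NULL);

    for (size_t i = 0; pa[i] != NULL && pa[i + 1] != NULL; ++i) {
        if (pa[i + 1] != pa[i] + strlen(pa[i]) + 1)
            return 0;
    }
    return 1;
}
```

Calling is_packed(pa) on the array returned by create_pointer_array_pointing_to_sequential_data() should yield 1; calling it on an array initialised directly from string literals may yield either value.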
