
Using a char* or a char array for strings within a struct

If I have the following pseudocode for a struct I would like to implement in C (for a text editor):

STRUCT line
    STRING line_contents
    INT line_length
END

Is there a best practice for how I write my strings? I see two options:

struct line {
    char* line_contents;
    size_t line_length;
};

or...

#define MAX_LINE_LENGTH 1024 // Some arbitrary number

struct line {
    char line_contents[MAX_LINE_LENGTH];
    size_t line_length;
};

The first one has the drawback of leaving the programmer with manual memory management. However, that is likely to be the case anyway if the structs are part of a linked list or some other advanced data structure.

The second one might use too much or too little memory, and it opens itself up to out-of-bounds errors.

Is the way you deal with strings dependent on your use case or is there a universal best practice?

There is no generic best practice on this one. It is mainly a trade-off between wasted space and code complexity.

As a rule of thumb, consider the line lengths in your typical documents. In a text editor a line might hold anywhere from 1 to maybe 100 bytes, so a fixed 100-byte buffer "wastes" at most 99 bytes per line, which in my opinion is acceptable on modern machines that are not memory-constrained. The catch: once a user wants a line of 101 characters, you are forced either to tell your users about the limit or to introduce expensive workarounds for extra-long lines (and so return to the complexity you were trying to avoid).
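The "complexity" side of that trade-off can be sketched as follows. This is a minimal sketch, not a definitive implementation: `struct dyn_line`, its `capacity` field, and the `dyn_line_*` functions are hypothetical names introduced here, and the starting capacity of 16 bytes is an arbitrary assumption. It extends the question's `char*` variant so that a line grows by doubling instead of hitting a fixed limit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of the question's first struct, with an extra
   capacity field so the buffer can grow geometrically. */
struct dyn_line {
    char  *line_contents;  /* heap buffer, always null-terminated */
    size_t line_length;    /* bytes stored, excluding the '\0' */
    size_t capacity;       /* bytes allocated for line_contents */
};

/* Initialise an empty line. Returns 0 on success, -1 if malloc fails
   (in which case *l is left untouched and must not be used). */
int dyn_line_init(struct dyn_line *l)
{
    assert(l != NULL);
    char *p = malloc(16);  /* arbitrary starting capacity */
    if (p == NULL)
        return -1;
    p[0] = '\0';
    l->line_contents = p;
    l->line_length = 0;
    l->capacity = 16;
    return 0;
}

/* Append n bytes from s, doubling the capacity until the result fits.
   Returns 0 on success, -1 if realloc fails (the line is unchanged).
   Overflow of the size computation is not handled in this sketch. */
int dyn_line_append(struct dyn_line *l, const char *s, size_t n)
{
    assert(l != NULL && s != NULL);
    size_t needed = l->line_length + n + 1;  /* +1 for the '\0' */
    if (needed > l->capacity) {
        size_t new_cap = l->capacity;
        while (new_cap < needed)
            new_cap *= 2;
        char *p = realloc(l->line_contents, new_cap);
        if (p == NULL)
            return -1;
        l->line_contents = p;
        l->capacity = new_cap;
    }
    memcpy(l->line_contents + l->line_length, s, n);
    l->line_length += n;
    l->line_contents[l->line_length] = '\0';
    return 0;
}

/* Release the buffer and reset the struct. */
void dyn_line_free(struct dyn_line *l)
{
    assert(l != NULL);
    free(l->line_contents);
    l->line_contents = NULL;
    l->line_length = 0;
    l->capacity = 0;
}
```

A caller would run `dyn_line_init` once, call `dyn_line_append` for each piece of typed text, and finish with `dyn_line_free`. The extra bookkeeping is the price paid for never having to tell the user about a line-length limit.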

You might, however, want to consider that line-oriented editor buffers have been widely out of fashion for at least 30 years. The most-used (and, IMHO, most accepted) buffer architecture is the one Emacs introduced some 30 years ago, commonly called a gap buffer: a big chunk of memory with an insertion gap that is moved back and forth to the place the user is editing.
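The insertion-gap idea described above is usually called a gap buffer, and it can be sketched roughly like this. The sketch is written under stated assumptions and does not reproduce Emacs's actual implementation: `struct gap_buffer` and the `gb_*` functions are hypothetical names, the buffer doubles when the gap is exhausted, and only single-character insertion is shown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical gap buffer: the text lives in buf[0, gap_start) and
   buf[gap_end, cap); the bytes in between are the unused gap. */
struct gap_buffer {
    char  *buf;
    size_t cap;
    size_t gap_start;
    size_t gap_end;
};

/* Create an empty buffer whose gap spans the whole allocation. */
int gb_init(struct gap_buffer *g, size_t cap)
{
    assert(g != NULL && cap > 0);
    g->buf = malloc(cap);
    if (g->buf == NULL)
        return -1;
    g->cap = cap;
    g->gap_start = 0;
    g->gap_end = cap;
    return 0;
}

/* Number of text bytes currently stored. */
size_t gb_length(const struct gap_buffer *g)
{
    return g->cap - (g->gap_end - g->gap_start);
}

/* Move the gap so that it starts at text position pos. */
void gb_move(struct gap_buffer *g, size_t pos)
{
    assert(pos <= gb_length(g));
    if (pos < g->gap_start) {
        /* Shift the text between pos and the gap to the gap's far end. */
        size_t n = g->gap_start - pos;
        memmove(g->buf + g->gap_end - n, g->buf + pos, n);
        g->gap_start -= n;
        g->gap_end -= n;
    } else if (pos > g->gap_start) {
        /* Shift the text just after the gap to the gap's near end. */
        size_t n = pos - g->gap_start;
        memmove(g->buf + g->gap_start, g->buf + g->gap_end, n);
        g->gap_start += n;
        g->gap_end += n;
    }
}

/* Insert one character at the gap, doubling the buffer if the gap is empty. */
int gb_insert(struct gap_buffer *g, char c)
{
    if (g->gap_start == g->gap_end) {
        size_t new_cap = g->cap * 2;
        char *p = realloc(g->buf, new_cap);
        if (p == NULL)
            return -1;
        size_t tail = g->cap - g->gap_end;  /* text after the gap */
        memmove(p + new_cap - tail, p + g->gap_end, tail);
        g->buf = p;
        g->gap_end = new_cap - tail;
        g->cap = new_cap;
    }
    g->buf[g->gap_start++] = c;
    return 0;
}

/* Copy the text into out, which must hold gb_length(g) + 1 bytes. */
void gb_text(const struct gap_buffer *g, char *out)
{
    memcpy(out, g->buf, g->gap_start);
    memcpy(out + g->gap_start, g->buf + g->gap_end, g->cap - g->gap_end);
    out[gb_length(g)] = '\0';
}

void gb_free(struct gap_buffer *g)
{
    free(g->buf);
    g->buf = NULL;
}
```

Typing at the cursor only writes into the gap, so consecutive insertions at the same place cost constant time each. The memmove cost is paid only when the cursor jumps elsewhere or the gap runs out.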

Is the way you deal with strings dependent on your use case or is there a universal best practice?

There is no "universal best" practice. It always depends on your specific use case.

But your use case is a text editor, so using a struct with a fixed maximum line length just seems wrong to me.

I would like to show a third way, which uses a flexible array member:

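#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct line {
    size_t line_length;
    char line_contents[];  /* Flexible array member: its size is
                              determined by the malloc call below */
};

int main(void)
{
    const char *str = "Hello World";
    size_t len = strlen(str);

    /* sizeof *l covers the fixed part of the struct; len + 1 is the
       size of the array, including the null terminator */
    struct line *l = malloc(sizeof *l + len + 1);
    if (l == NULL)
        return 1;

    l->line_length = len;
    strcpy(l->line_contents, str);
    printf("Len %zu: %s\n", l->line_length, l->line_contents);
    free(l);
    return 0;
}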

In this way, a single malloc call allocates both a new node and the memory for its string.
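To make the "new node" part concrete, here is a minimal sketch of how the flexible array member might be combined with a linked list. The names `struct line_node`, `line_node_new`, and `line_list_free` are hypothetical and not part of the answer above. The sketch assumes a singly linked list with one node per line.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list node: link, length and text share one allocation. */
struct line_node {
    struct line_node *next;
    size_t line_length;
    char line_contents[];  /* flexible array member, must be last */
};

/* Allocate a node holding a copy of text. Returns NULL if malloc fails. */
struct line_node *line_node_new(const char *text)
{
    assert(text != NULL);
    size_t len = strlen(text);
    struct line_node *n = malloc(sizeof *n + len + 1);
    if (n == NULL)
        return NULL;
    n->next = NULL;
    n->line_length = len;
    memcpy(n->line_contents, text, len + 1);  /* copies the '\0' too */
    return n;
}

/* Free every node in the list; one free per node releases the text as well. */
void line_list_free(struct line_node *head)
{
    while (head != NULL) {
        struct line_node *next = head->next;
        free(head);
        head = next;
    }
}
```

One consequence of this layout is that growing a line means reallocating the whole node. The node's address may then change, and any pointer to it (for example the previous node's `next`) has to be updated.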

The approach commonly used in C is to work with null-terminated strings through the standard library header string.h.

Strings in C are designed with a null terminator at the end that marks where the string ends. The null terminator is the '\0' character.
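As a small illustration of what the terminator means in practice, the sketch below counts characters until it reaches '\0', which is essentially what `strlen` has to do. The function name `count_until_terminator` is hypothetical and exists only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strlen, for illustration only: walk the
   bytes until the null terminator is found. */
size_t count_until_terminator(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

Because this scan is linear in the length of the string, keeping a separate `line_length` field, as the structs in the question do, avoids rescanning a line every time its length is needed.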

Your code can be rewritten as follows.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

const int MAX_STRING_LENGTH = 256;

int main(void)
{
   /* +1 for the additional null terminator */
   char *line = malloc(MAX_STRING_LENGTH + 1);
   if (line == NULL)
      return 1;
   strcpy(line, "Hello stack overflow!");

   /* 'line' can hold strings of up to 256 bytes, but to get the
   length of the "Hello stack overflow!" string that was copied
   into it, you can use the strlen() function as shown below.
   strlen() returns a size_t, so it is printed with %zu. */
   printf("Length of 'line' string: %zu\n", strlen(line));

   free(line);
   return 0;
}
