
Using a char* or a char array for strings within a struct

I have the following pseudocode for a struct that I would like to implement in C (for a text editor):

STRUCT line
    STRING line_contents
    INT line_length
END

Is there a best practice for how I write my strings? I see two options:

struct line {
    char* line_contents;
    size_t line_length;
};

or...

#define MAX_LINE_LENGTH 1024 // Some arbitrary number

struct line {
    char line_contents[MAX_LINE_LENGTH];
    size_t line_length;
};

The first one has the drawback of leaving the programmer with manual memory management; however, this is likely to be the case anyway if the structs are part of a linked list or some other more advanced data structure.

The second one might use too much or too little memory, and it opens itself up to out-of-bounds errors, etc.

Is the way you deal with strings dependent on your use case or is there a universal best practice?

There is no generic best practice on this one. It is mainly a trade-off between wasted space and complexity of the code.

As a rule of thumb, you might consider the typical line lengths in your typical documents. In a text editor, lines run from 1 to maybe 100 bytes, so a 100-byte buffer has a maximum "waste" of 99 bytes per line, which in my opinion is acceptable on modern machines that are not memory-restricted. The point is: once your user wants a line of 101 characters, you're forced to either tell your users about the limit or introduce expensive workarounds for extra-long lines (and you are back to complexity).
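To make the "complexity" side of that trade-off concrete, here is a minimal sketch of the char* option with a buffer that grows on demand. The `capacity` field and the helper names `line_append` and `line_free` are assumptions added for illustration; they are not part of the question's struct.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable line: the buffer lives on the heap and doubles
   in capacity when it fills up, so no fixed maximum is needed. */
struct line {
    char  *line_contents;  /* null-terminated once allocated */
    size_t line_length;    /* bytes in use, excluding the terminator */
    size_t capacity;       /* bytes allocated (illustrative addition) */
};

/* Append n bytes from src. Returns 0 on success, -1 if allocation fails. */
int line_append(struct line *l, const char *src, size_t n)
{
    assert(l != NULL && src != NULL);
    size_t needed = l->line_length + n + 1;   /* +1 for '\0' */
    if (needed > l->capacity) {
        size_t new_cap = l->capacity ? l->capacity : 16;
        while (new_cap < needed)
            new_cap *= 2;
        char *p = realloc(l->line_contents, new_cap);
        if (p == NULL)
            return -1;                        /* old buffer is still valid */
        l->line_contents = p;
        l->capacity = new_cap;
    }
    memcpy(l->line_contents + l->line_length, src, n);
    l->line_length += n;
    l->line_contents[l->line_length] = '\0';
    return 0;
}

/* Release the buffer and reset the struct to its empty state. */
void line_free(struct line *l)
{
    free(l->line_contents);
    l->line_contents = NULL;
    l->line_length = 0;
    l->capacity = 0;
}
```

A line is started with `struct line l = {0};`, after which `line_append(&l, "Hello", 5)` allocates on first use. Doubling the capacity keeps the number of realloc calls small as a line grows, at the cost of some slack space.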

You might, however, want to consider that line-oriented editor buffers have been widely out of fashion for at least 30 years. The most-used (and accepted, IMHO) buffer architecture is the one Emacs introduced some 30 years ago: a big chunk of memory with an insertion gap that is moved back and forth to the place the user is editing.
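The gap idea described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions (a fixed-size buffer that never grows, single-character inserts); the `gap_buffer` struct and the `gb_*` names are made up here and do not reflect how Emacs actually implements it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct gap_buffer {
    char  *buf;
    size_t size;       /* total allocated bytes */
    size_t gap_start;  /* first byte of the gap (the cursor position) */
    size_t gap_end;    /* one past the last byte of the gap */
};

/* Allocate an empty buffer; the gap initially spans all of it. */
int gb_init(struct gap_buffer *g, size_t size)
{
    g->buf = malloc(size);
    if (g->buf == NULL)
        return -1;
    g->size = size;
    g->gap_start = 0;
    g->gap_end = size;
    return 0;
}

/* Move the gap so that it starts at text position pos. */
void gb_move(struct gap_buffer *g, size_t pos)
{
    size_t text_len = g->size - (g->gap_end - g->gap_start);
    assert(pos <= text_len);
    if (pos < g->gap_start) {          /* shift text from before the gap to after it */
        size_t n = g->gap_start - pos;
        memmove(g->buf + g->gap_end - n, g->buf + pos, n);
        g->gap_start -= n;
        g->gap_end -= n;
    } else if (pos > g->gap_start) {   /* shift text from after the gap to before it */
        size_t n = pos - g->gap_start;
        memmove(g->buf + g->gap_start, g->buf + g->gap_end, n);
        g->gap_start += n;
        g->gap_end += n;
    }
}

/* Insert one character at the cursor. Returns -1 if the gap is full
   (a real editor would enlarge the buffer here). */
int gb_insert(struct gap_buffer *g, char c)
{
    if (g->gap_start == g->gap_end)
        return -1;
    g->buf[g->gap_start++] = c;
    return 0;
}

/* Copy the text without the gap into out, which must hold the
   text length plus one byte for the terminator. */
void gb_text(const struct gap_buffer *g, char *out)
{
    size_t tail = g->size - g->gap_end;
    memcpy(out, g->buf, g->gap_start);
    memcpy(out + g->gap_start, g->buf + g->gap_end, tail);
    out[g->gap_start + tail] = '\0';
}
```

Typing at the cursor costs a single store, and only moving the cursor to a distant position pays for a memmove, which is why this layout suits interactive editing.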

Is the way you deal with strings dependent on your use case or is there a universal best practice?

There is no "universal best" practice. It always depends on your specific use case.

But... your use case is a text editor, so using a struct with a fixed maximum line length just seems wrong to me.

I would like to show a third way, which uses a flexible array member:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct line {
    size_t line_length;
    char line_contents[];  /* Flexible array member: its size is
                              determined by the malloc call */
};

int main(void)
{
    const char *str = "Hello World";
    size_t len = strlen(str);

    /* sizeof *l covers the fixed part; len + 1 is the size of the array */
    struct line *l = malloc(sizeof *l + len + 1);
    if (l == NULL)
        return 1;

    l->line_length = len;
    strcpy(l->line_contents, str);
    printf("Len %zu: %s\n", l->line_length, l->line_contents);
    free(l);
    return 0;
}

In this way a single malloc can allocate both a new node and the memory for the string.
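As a rough sketch of that node-plus-string allocation, assuming the lines are kept in a singly linked list: the `next` pointer and the names `line_node`, `line_node_new`, and `line_list_free` are additions for illustration, not part of the struct shown in this answer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list node; `next` is an illustrative addition. */
struct line_node {
    struct line_node *next;
    size_t line_length;
    char   line_contents[];   /* flexible array member, must be last */
};

/* Allocate the node and its text in one malloc call.
   Returns NULL if the allocation fails. */
struct line_node *line_node_new(const char *text, struct line_node *next)
{
    assert(text != NULL);
    size_t len = strlen(text);
    struct line_node *n = malloc(sizeof *n + len + 1);
    if (n == NULL)
        return NULL;
    n->next = next;
    n->line_length = len;
    memcpy(n->line_contents, text, len + 1);  /* copies the '\0' too */
    return n;
}

/* One free per node releases both the node and its text. */
void line_list_free(struct line_node *head)
{
    while (head != NULL) {
        struct line_node *next = head->next;
        free(head);
        head = next;
    }
}
```

One trade-off to keep in mind: because the text lives inside the node, growing a line means reallocating the whole node, which may move it, so any pointer to that node (such as the previous node's `next`) has to be updated.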

A common solution in C is to use the functions declared in the standard string.h header.

Strings in C are designed with a null terminator at the end, which marks where the string ends. The null terminator is the '\0' character. For example, "Hi" is stored as the three bytes 'H', 'i', '\0'.

Your code can be rewritten as follows.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

const int MAX_STRING_LENGTH = 256;

int main()
{
   /* +1 for the null terminator */
   char* line = malloc(MAX_STRING_LENGTH + 1);
   if (line == NULL)
      return 1;
   strcpy(line, "Hello stack overflow!");

   /* 'line' can hold strings of up to 256 bytes. To get the length of
   the "Hello stack overflow!" string that was copied into 'line',
   use strlen() as shown below. strlen() returns a size_t, so the
   matching printf format is %zu. */
   printf("Length of 'line' string: %zu\n", strlen(line));

   free(line);
   return 0;
}
