How do I fit a variable sized char array in a struct?

I don't understand how reallocating the memory for a struct allows me to insert a larger char array into it.

Struct definition:

typedef struct props
{
    char northTexture[1];
    char southTexture[1];
    char eastTexture[1];
    char westTexture[1];
    char floorTexture[1];
    char ceilingTexture[1];
} PROPDATA;

Example:

void SetNorthTexture( PROPDATA* propData, char* northTexture )
{
    if( strlen( northTexture ) != strlen( propData->northTexture ) )
    {
        PROPDATA* propPtr = (PROPDATA*)realloc( propData, sizeof( PROPDATA ) +
            sizeof( northTexture ) );
        if( propPtr != NULL )
        {
            strcpy( propData->northTexture, northTexture );
        } 
    }
    else
    {
        strcpy( propData->northTexture, northTexture );
    }
}

I have tested something similar to this and it appears to work; I just don't understand how it works. Now I expect some people are thinking "just use a char*", but for whatever reason I can't. The string has to be stored in the struct itself.

My confusion comes from the fact that I haven't resized my struct for any specific purpose. I haven't indicated anywhere that I want the extra space to be allocated to the north texture char array in that example. I imagine the extra bit of memory I allocated is used for actually storing the string, and somehow when I call strcpy, it realises there is not enough space...

Any explanation of how this works (or even of how it is flawed) would be great.

Is this C or C++? The code you've posted is C, but if it's actually C++ (as the tag implies) then use std::string. If it's C, then there are two options.

If (as you say) you must store the strings in the structure itself, then you can't resize them. C structures simply don't allow that. The "array of size 1" trick is sometimes used to bolt a single variable-length field onto the end of a structure, but it can't be used anywhere else because each field has a fixed offset within the structure. The best you can do is decide on a maximum size and make each field an array of that size.

Otherwise, store each string as a char* and resize it with realloc.
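
A minimal sketch of the char* option, assuming a two-field variant of the struct. The names PROPDATA_PTR and set_texture are hypothetical and do not come from the question. The struct keeps a fixed size and each string lives in its own heap buffer:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of PROPDATA: each texture name lives in its own
   heap buffer, so the struct itself keeps a fixed size. */
typedef struct props_ptr
{
    char *northTexture;
    char *southTexture;
} PROPDATA_PTR;

/* Replace the string held in *field with a copy of value.
   Returns 0 on success, -1 if memory could not be allocated
   (in which case the old string is left untouched). */
int set_texture(char **field, const char *value)
{
    assert(field != NULL && value != NULL);

    size_t needed = strlen(value) + 1;   /* +1 for the terminating '\0' */

    /* realloc(NULL, n) behaves like malloc(n), so this also covers the
       first assignment to a field that starts out as NULL. */
    char *resized = (char *)realloc(*field, needed);
    if (resized == NULL)
        return -1;

    memcpy(resized, value, needed);
    *field = resized;                    /* the block may have moved */
    return 0;
}
```

A caller would start from `PROPDATA_PTR p = { NULL, NULL };`, call `set_texture(&p.northTexture, "brick")`, and eventually free each field.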

strcpy is not that intelligent, and this is not really working.

The call to realloc() allocates some extra space (strictly, sizeof( northTexture ) is the size of a pointer, not the length of the string), so it doesn't actually crash, but when you strcpy the string to propData->northTexture you may be overwriting anything that follows northTexture in propData: propData->southTexture, propData->westTexture, etc.

For example, if you called SetNorthTexture(prop, "texture"); and printed out the different textures, you would probably find that:

 northTexture is "texture"
 southTexture is "exture"
 eastTexture is "xture" etc (assuming that the arrays are byte aligned). 
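
The overlap follows directly from the member offsets. The sketch below reproduces it without writing past a real struct: it copies the string into a plain char buffer at the offset where northTexture would start and then reads from the offsets of the other members. The helper name view_after_copy is hypothetical, and the offsets 0, 1, 2 assume the compiler puts no padding between the char arrays, which is typical but not something the sketch can guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct props
{
    char northTexture[1];
    char southTexture[1];
    char eastTexture[1];
    char westTexture[1];
    char floorTexture[1];
    char ceilingTexture[1];
} PROPDATA;

/* Hypothetical helper: simulate the copy inside a plain char buffer.
   Writes `text` where northTexture would start, then returns a pointer
   to where the member at `member_offset` would start. */
const char *view_after_copy(char *raw, size_t raw_size,
                            const char *text, size_t member_offset)
{
    assert(offsetof(PROPDATA, northTexture) + strlen(text) + 1 <= raw_size);
    assert(member_offset < raw_size);

    memset(raw, 0, raw_size);
    strcpy(raw + offsetof(PROPDATA, northTexture), text);
    return raw + member_offset;
}
```

With a 16-byte buffer and the text "texture", reading at offsetof(PROPDATA, southTexture) yields the tail of the same string, because southTexture begins one byte after northTexture.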

Assuming you don't want to statically allocate char arrays big enough to hold the largest strings, and you absolutely must have the strings in the structure, you can store the strings one after the other at the end of the structure. Obviously you will need to malloc your structure dynamically so that it has enough space to hold all the strings plus the offsets to their locations.

This is very messy and inefficient, as you need to shuffle things around whenever strings are added, deleted or changed.

My confusion comes from the fact that I haven't resized my struct for any specific purpose.

In low-level languages like C there is a distinction between structs (or types in general) and actual memory. Allocation basically consists of two steps:

  1. Allocating a raw memory buffer of the right size
  2. Telling the compiler that this piece of raw bytes should be treated as a structure

When you call realloc, you do not change the structure; you change the buffer it is stored in, so you can use the extra space beyond the structure.

Note that, although your program will not crash, it's not correct. When you put text into northTexture, you will overwrite the other structure fields.
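
The two steps can be sketched as follows. The helper name grow_props is hypothetical. The point is that sizeof(PROPDATA) is fixed at compile time, while realloc only changes the size (and possibly the address) of the buffer, so the caller must keep using the pointer that realloc returns:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct props
{
    char northTexture[1];
    char southTexture[1];
    char eastTexture[1];
    char westTexture[1];
    char floorTexture[1];
    char ceilingTexture[1];
} PROPDATA;

/* Hypothetical helper: resize the buffer holding `block` so that it has
   room for the struct plus `extra_bytes` more.
   Returns the new address (which may differ from `block`), or NULL on
   failure, in which case `block` is still valid and unchanged. */
PROPDATA *grow_props(PROPDATA *block, size_t extra_bytes)
{
    assert(extra_bytes <= SIZE_MAX - sizeof(PROPDATA));  /* no overflow */

    /* sizeof(PROPDATA) is a compile-time constant; only the buffer grows.
       The layout of the members inside it stays exactly the same. */
    return (PROPDATA *)realloc(block, sizeof(PROPDATA) + extra_bytes);
}
```

A caller would write `PROPDATA *bigger = grow_props(p, 32); if (bigger != NULL) p = bigger;`. After that, `sizeof(*p)` is unchanged even though the buffer behind it is larger.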

This answer is not meant to promote the practice described below, but to explain things. There are good reasons not to use malloc, and the suggestions in other answers to use std::string are valid.

I think you have come across the trick used, for example, by Microsoft to avoid the cost of a pointer dereference. In the case of Unsized Arrays in Structures (please check the link) it relies on a non-standard extension to the language. You can use a trick like that even without the extension, but only for the struct member that is positioned last in memory. Usually the last member in the structure declaration is also the last in memory, but check this question to know more about it. For the trick to work, you also have to make sure the compiler won't add padding bytes at the end of the structure.

The general idea is like this. Suppose you have a structure with an array at the end, like

struct MyStruct
{
    int someIntField;
    char someStr[1];
};

When allocating on the heap, you would normally say something like this:

MyStruct* msp = (MyStruct*)malloc(sizeof(MyStruct));

However, if you allocate more space than your struct actually occupies, you can reference the bytes laid out in memory right behind the struct with "out of bounds" access to the array elements. Assuming some typical sizes for the int and the char, and no padding bytes at the end, if you write this:

MyStruct* msp = (MyStruct*)malloc(sizeof(MyStruct) + someMoreBytes);

The memory layout should look like this (the offsets are in bytes from the address held in msp):

|    msp   |   msp+1  |   msp+2  |   msp+3  |   msp+4  |   msp+5  |   msp+6  | ... |
|    <-         someIntField         ->     |someStr[0]|  <-   someMoreBytes  ->   |

In that case, you can reference the byte at offset 6 (msp+6 in the diagram) like this:

msp->someStr[2];
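
For comparison, standard C (since C99) offers a sanctioned form of this trick, the flexible array member: the last member is declared with empty brackets and the extra bytes are requested at allocation time. The sketch below swaps that in for the [1] version above. The names MyStructFlex and make_mystruct are hypothetical, and flexible array members are not part of standard C++.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* C99 flexible array member: the array has no declared size and must be
   the last member of the struct. */
struct MyStructFlex
{
    int someIntField;
    char someStr[];
};

/* Allocate one block that holds the struct followed by a copy of `text`.
   Returns NULL if the allocation fails; the caller frees the result. */
struct MyStructFlex *make_mystruct(int value, const char *text)
{
    assert(text != NULL);

    size_t len = strlen(text) + 1;   /* +1 for the terminating '\0' */
    struct MyStructFlex *msp =
        (struct MyStructFlex *)malloc(sizeof *msp + len);
    if (msp == NULL)
        return NULL;

    msp->someIntField = value;
    memcpy(msp->someStr, text, len); /* bytes land right behind the struct */
    return msp;
}
```

After `make_mystruct(7, "texture")`, the expression `msp->someStr[2]` refers to the third character of the stored string, with a single free(msp) releasing everything.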

NOTE: This has no char array example, but it is the same principle. It is just my guess at what you are trying to achieve.

My opinion is that you have seen something like this somewhere:

typedef struct tagBITMAPINFO {
  BITMAPINFOHEADER bmiHeader;
  RGBQUAD          bmiColors[1];
} BITMAPINFO, *PBITMAPINFO;

What you are trying to obtain can happen only when the array is at the end of the struct (and there is only one such array).

For example, you allocate sizeof(BITMAPINFO)+15*sizeof(RGBQUAD) when you need to store 16 RGBQUAD structures (1 from the structure and 15 extra).

PBITMAPINFO info = (PBITMAPINFO)malloc(sizeof(BITMAPINFO)+15*sizeof(RGBQUAD));

You can access all the RGBQUAD structures as if they were inside the BITMAPINFO structure:

info->bmiColors[0]
info->bmiColors[1]
...
info->bmiColors[15]

You can do something similar with an array declared as char bufStr[1] at the end of a struct.

Hope it helps.

One approach to keeping a struct and all its strings together in a single allocated memory block is something like this:

struct foo {
    ptrdiff_t s1, s2, s3, s4;
    size_t bufsize;
    char buf[1];
} bar;

Allocate sizeof(struct foo)+total_string_size bytes and store the offset of each string in the s1, s2, etc. members; bar.buf+bar.s1 is then a pointer to the first string, bar.buf+bar.s2 a pointer to the second string, etc.

You can use pointers rather than offsets if you know you won't need to realloc the struct.
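
A minimal sketch of the offset scheme, simplified to two strings and using a C99 flexible array member in place of buf[1]. The functions foo_create, foo_first and foo_second are hypothetical names introduced only for this illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct foo {
    ptrdiff_t s1, s2;   /* offsets of each string inside buf */
    size_t bufsize;     /* total number of bytes used in buf */
    char buf[];         /* flexible array member holding all the strings */
};

/* Build one heap block that contains the header and both strings.
   Returns NULL if the allocation fails; the caller frees the result. */
struct foo *foo_create(const char *first, const char *second)
{
    assert(first != NULL && second != NULL);

    size_t n1 = strlen(first) + 1;    /* each length includes the '\0' */
    size_t n2 = strlen(second) + 1;

    struct foo *f = (struct foo *)malloc(sizeof *f + n1 + n2);
    if (f == NULL)
        return NULL;

    f->bufsize = n1 + n2;
    f->s1 = 0;                        /* first string starts at buf[0] */
    f->s2 = (ptrdiff_t)n1;            /* second string follows the first */
    memcpy(f->buf + f->s1, first, n1);
    memcpy(f->buf + f->s2, second, n2);
    return f;
}

/* Offsets stay valid even if the block is later moved by realloc. */
const char *foo_first(const struct foo *f)  { return f->buf + f->s1; }
const char *foo_second(const struct foo *f) { return f->buf + f->s2; }
```

Because only offsets are stored, the whole block can be moved with realloc or copied with memcpy without fixing up any members, which is what makes this layout suitable for the single-allocation use described here.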

Whether it makes sense to do something like this at all is debatable. One benefit is that it may help fight memory fragmentation or malloc/free overhead when you have a huge number of tiny data objects (especially in threaded environments). It also reduces the complexity of error-handling cleanup, since you have only a single malloc failure to check for. There may be cache benefits from ensuring data locality. And it's possible (if you use offsets rather than pointers) to store the object on disk without any serialization (keeping in mind that your files are then machine/compiler-specific).
