What's the best practice when you have an array in a struct in C?

I have a custom struct which I'm gonna use to send data over a TCP connection. What would be the best way of declaring an array inside this struct? Would it be:

typedef struct programData {
    int* dataArray;
    size_t numberofelements;
} pd;
// ...
pd data = {0};
data.dataArray = malloc(5*sizeof(int));
// put content in array ...
data.numberofelements = 5;

Or would it be this way:

typedef struct programData {
    int dataArray[5];
} pd;
// ...
pd data = {0};
data.dataArray[0] = ...;
// ...
data.dataArray[4] = ...;

I did it the first way out of a habit of using malloc() in C, but I don't think the contents of the array would actually be passed on to the client on the other side of the connection, since dataArray would just be a pointer to an address inside the server's memory. Or would send(2) actually send the contents of the array with it?

Edit: some inconsistencies due to copy-pasting from my code.

send is not a service for transmitting compound data structures; it does not interpret pointers or the data they refer to. It is a service for sending raw bytes. When using send, you must transform your data into raw bytes that can be sent. The receiver must construct their own data structures from those bytes. This means you must create a scheme for representing your data using bytes.

When the raw bytes of a structure are sent to another system, and the receiving system uses those same raw bytes to represent a structure, the resulting meaning of the data may differ for reasons including:

  • The systems represent objects (such as integers) with bytes in different orders.
  • The systems insert different numbers of padding bytes in the structure to maintain alignment required or preferred by hardware.
  • The systems use different encodings for characters or floating-point data.
  • The types on the systems are different; for example, one may use two bytes for int while the other uses four.
  • Pointers on one system are meaningless on the other system: they point to data that was never transmitted, and they contain addresses that are not relevant to the address layout on the other system.
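
One common way to sidestep the byte-order and type-size issues in the list above is to fix the wire representation explicitly. The following is a minimal sketch, not part of the original answer: the helper names put_u32_be and get_u32_be are hypothetical, and the choice of a 32-bit big-endian encoding is an assumption made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: write a 32-bit value into buf as four bytes,
   most significant byte first, regardless of the host's byte order. */
static void put_u32_be(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)((v >> 24) & 0xFF);
    buf[1] = (unsigned char)((v >> 16) & 0xFF);
    buf[2] = (unsigned char)((v >> 8) & 0xFF);
    buf[3] = (unsigned char)(v & 0xFF);
}

/* Hypothetical helper: read back a value written by put_u32_be. */
static uint32_t get_u32_be(const unsigned char *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}
```

Because the bytes are produced with shifts rather than by copying the object's memory, both sides agree on the layout even if their native int sizes or byte orders differ.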

With a simple data structure, it is possible to define the protocol so that it simply sends the actual bytes that represent the data structure. This is especially true if the sending and receiving systems use the same hardware and software. However, even in such cases, the protocol should be clearly specified: how big each element is, what data encodings are used, what order the bytes within each element are in, and so on.

Assuming you have simple data structures and use a simple protocol of sending the actual bytes that represent the data, then declaring an array inside the structure is the simplest option. If the array is small or is usually nearly full, so that only a small amount of waste occurs by storing and transmitting unused data, then declaring an array inside the structure may be a fine solution.
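
For the fixed-size case, sending the structure's own bytes could look roughly like the sketch below. It assumes both ends are built with the same compiler and run on the same architecture, so that the layout of pd matches; the helper name send_pd_raw is hypothetical, and a short send is simply treated as a failure here.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

typedef struct programData {
    int dataArray[5];
} pd;

/* Hypothetical helper: send the struct's raw bytes in one call.
   Returns 0 if every byte was accepted, -1 otherwise.
   Assumption: sender and receiver share the same layout for pd. */
static int send_pd_raw(int fd, const pd *data)
{
    ssize_t n = send(fd, data, sizeof *data, 0);
    return (n == (ssize_t)sizeof *data) ? 0 : -1;
}
```

The receiver would read sizeof(pd) bytes into its own pd object. This only works because the array lives inside the structure, so its contents are part of the bytes being sent.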

If the amount of data needed in the array will vary more than slightly, then it is usually preferred to allocate the array dynamically, as a matter of resource efficiency. As shown in your question, the structure may contain a pointer, which is filled in with the address of the array data.

When a structure contains such a pointer, you cannot send the pointer with send (without making additional efforts to provide for its interpretation). Instead, you will need to use one or more send calls to send the other data in the structure, and then another send call to send the data in the array. And, of course, your protocol for transmitting the data must include a way to communicate the number of array elements being sent.
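
The approach described above, sending the element count first and the array contents afterwards, might be sketched as follows. The wire format (a 32-bit count followed by 32-bit elements, all in network byte order) and the helper names send_all and send_int_array are assumptions for illustration; handling of EINTR and similar conditions is left out.

```c
#include <arpa/inet.h>   /* htonl */
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: keep calling send() until all len bytes are written,
   because send() may accept fewer bytes than requested. Returns 0 or -1. */
static int send_all(int fd, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n < 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Assumed wire format: 32-bit element count, then the elements,
   each converted to network byte order. The pointer itself is never sent. */
static int send_int_array(int fd, const int32_t *data, uint32_t count)
{
    uint32_t wire_count = htonl(count);
    if (send_all(fd, &wire_count, sizeof wire_count) != 0)
        return -1;

    for (uint32_t i = 0; i < count; i++) {
        uint32_t wire_elem = htonl((uint32_t)data[i]);
        if (send_all(fd, &wire_elem, sizeof wire_elem) != 0)
            return -1;
    }
    return 0;
}
```

The receiver reads the four-byte count first, and from it knows how many more bytes to expect before the array is complete.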

One more option combines dynamic allocation of space for the array with including the array in the structure: the last element of a structure may be a flexible array member. This is an array declared within the structure as Type dataArray[];. It must be the last element of the structure. It has no intrinsic size, but, when allocating space for the structure, you add additional space for the array. In this case, instead of the structure having a pointer to an array, the array follows the base portion of the structure in memory. Such a structure with its array could be sent in a single send call, provided the cautions above are addressed: the receiving system must be able to interpret the bytes correctly, and the size of the array must be communicated.
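
A flexible array member as described above could be declared and allocated along these lines. This is a sketch only: the type struct packet and the helpers packet_size and packet_alloc are hypothetical names, and the fixed-width element types are an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct packet {
    uint32_t count;     /* number of elements that follow */
    int32_t  values[];  /* flexible array member: must be the last member */
};

/* Total bytes occupied by a packet holding count elements. */
static size_t packet_size(uint32_t count)
{
    return sizeof(struct packet) + (size_t)count * sizeof(int32_t);
}

/* Allocate the base portion and the array as one zeroed block.
   Returns NULL on allocation failure; the caller releases it with free(). */
static struct packet *packet_alloc(uint32_t count)
{
    struct packet *p = calloc(1, packet_size(count));
    if (p != NULL)
        p->count = count;
    return p;
}
```

Since the count and the elements are contiguous, a call such as send(fd, p, packet_size(p->count), 0) would transmit both together, subject to the same byte-order and layout cautions as before.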

Best practice is to let the requirements of your project determine which approach to use. Both have distinct advantages depending on what is needed.

Given your two examples:

1)

typedef struct programData {
    int dataArray[5]; // assuming '*' was a typo
} pd;

2)

typedef struct programData {
    int* dataArray;
    size_t numberofelements;
} pd; 

If you know the size requirement before run-time, then Option 1), the simpler approach, is always preferred. If not, then Option 2) is needed, but it has its costs. Dynamic allocation of memory adds complexity to the code with respect to error handling and memory management: you have to make sure everything allocated with calloc and family is freed once you are done using it.

Serialization and de-serialization are recommended for transmitting either form (and are required for Option 2, since pointers are used). The extra rigor in the implementation pays dividends in terms of increased predictability of exactly what is being sent.
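
On the receiving side, de-serialization for Option 2 might look something like the sketch below. It assumes a wire format of a 32-bit big-endian element count followed by 32-bit big-endian elements; that format and the names read_u32_be and pd_deserialize are hypothetical, not something specified in the question.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct programData {
    int *dataArray;
    size_t numberofelements;
} pd;

/* Decode four big-endian bytes into a 32-bit value. */
static uint32_t read_u32_be(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Rebuild a pd from a received buffer (assumed format: count, then elements).
   Returns 0 on success, -1 on malformed input or allocation failure.
   On success the caller must free out->dataArray. */
static int pd_deserialize(const unsigned char *buf, size_t len, pd *out)
{
    if (len < 4)
        return -1;                    /* no room for the count field */

    uint32_t count = read_u32_be(buf);
    if ((len - 4) / 4 < count)
        return -1;                    /* fewer bytes than the count claims */

    /* Allocate at least one element so a zero count still yields a valid pointer. */
    int *arr = calloc(count ? count : 1, sizeof *arr);
    if (arr == NULL)
        return -1;

    for (uint32_t i = 0; i < count; i++)
        arr[i] = (int)(int32_t)read_u32_be(buf + 4 + 4 * (size_t)i);

    out->dataArray = arr;
    out->numberofelements = count;
    return 0;
}
```

Checking the declared count against the number of bytes actually received keeps a malformed or truncated message from causing an out-of-bounds read.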
