Write values of different data types in sequence in memory? Or, an array with multiple data types?

I am relatively new to writing C. I taught myself using whatever resources I could find online and in print. This is my first real project in C programming. Gotta love on-the-job training.

I am writing some code in C that is being used on a Texas Instruments C6701 Digital Signal Processor. Specifically, I am writing a set of communication functions to interface through a serial port.

The project I'm on has an existing packet protocol for sending data through the serial port. This works by handing over a pointer to the data to be transmitted and its length in bytes. All I have to do is write the bytes to be transmitted into an "array" in memory (the transmitter copies that sequence of bytes into a buffer and transmits that).

My question pertains to how best to format the data to be transmitted. The data I have to send is composed of several different data types (unsigned char, unsigned int, float, etc.). I can't expand everything up to float (or int) because I have constrained communication bandwidth and need to keep packets as small as possible.

I originally wanted to use arrays to format the data:

unsigned char dataTx[10];
dataTx[0]=char1;
dataTx[1]=char2;
etc...

This would work, except that not all my data is char; some is unsigned int or unsigned short.

To handle short and int I used bit shifting (let's ignore little-endian vs. big-endian for now).

unsigned char dataTx[10];
dataTx[0]=short1>>8;
dataTx[1]=short1;
dataTx[2]=int1>>24;
dataTx[3]=int1>>16;
etc...

However, I believe another (and better?) way to do this is to use pointers and pointer arithmetic.

unsigned char dataTx[10];
*(dataTx+0) = int1;
*(dataTx+4) = short1;
*(dataTx+6) = char1;
etc...

My question (finally) is: which method (bit shifting or pointer arithmetic) is the more acceptable one? Also, is one faster to run? (I also have run-time constraints.)

My requirement: the data must be located in memory serially, without gaps, breaks, or padding.

I don't know enough about structures yet to know if a structure would work as a solution. Specifically, I don't know if a structure always allocates memory locations serially and without breaks. I read something that indicates they allocate in 8-byte blocks and possibly introduce padding bytes.

Right now I'm leaning towards the pointer method. Thanks for reading this far into what seems to be a long post.

You may want to use an array of unions.

The easiest and most traditional way to handle your problem is to set up the data you want to send, and then pass a pointer to your data on to the transmission routine. The most common example would be the POSIX send() routine:

ssize_t send(int socket, const void *buffer, size_t length, int flags);

Which, for your case, you can simplify to:

ssize_t send(const void *buffer, size_t length);

And then use something like:

send(&int1, sizeof int1);
send(&short1, sizeof short1);

That sends it out. An example (but pretty naive) implementation for your situation might be:

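ssize_t send(const void *buffer, size_t length)
{
  size_t i;
  const unsigned char *data = buffer;  /* byte-wise view of the caller's data */

  /* Copy each byte into the transmit array. */
  for (i = 0; i < length; i++)
  {
     dataTx[i] = data[i];
  }

  return (ssize_t)length;  /* number of bytes copied */
}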

In other words, use the automatic conversion to void * and then back to char * to get byte-wise access to your data, and then send it out appropriately.
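The naive routine always copies to the start of dataTx, so several values sent one after another would overwrite each other. A minimal sketch of one way around that keeps a write cursor and appends each value behind the previous one. The names tx_append, tx_reset, txLen and TX_MAX are assumptions made up for this illustration and are not part of the existing packet protocol.

```c
#include <stddef.h>
#include <string.h>

#define TX_MAX 64                 /* hypothetical packet size limit */

unsigned char dataTx[TX_MAX];     /* packet being assembled */
size_t txLen = 0;                 /* number of bytes written so far */

/* Start a new, empty packet. */
void tx_reset(void)
{
    txLen = 0;
}

/* Append length bytes from buffer to the packet.
   Returns the number of bytes appended, or 0 if they would not fit. */
size_t tx_append(const void *buffer, size_t length)
{
    if (length > TX_MAX - txLen)
        return 0;
    memcpy(dataTx + txLen, buffer, length);  /* no alignment requirement */
    txLen += length;
    return length;
}
```

After calls such as tx_append(&int1, sizeof int1) and tx_append(&short1, sizeof short1), the pair dataTx and txLen is what would be handed to the transmitter. The bytes land in the host's byte order, so this only works unchanged if the receiver uses the same endianness.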

Long question; I'll try a shorter answer.

Don't go with *(dataTx+4) = short1; and the like. This method may fail because many chips can read and write only at aligned positions: you can access 16 bits at positions aligned to 2, and 32 bits at positions aligned to 4. Take the example "int32 char8 int32": the second int32 sits at (dataTx+5), which is not 4-byte aligned, so you will probably get a "bus error" or something like that (depending on the CPU you use). I hope that makes the issue clear.
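To make the arithmetic in the "int32 char8 int32" example concrete, here is a small sketch. It assumes natural alignment (a value of N bytes must start at a multiple of N) and a buffer that itself starts on an aligned address; offset_is_aligned is a hypothetical helper written for this illustration.

```c
#include <stddef.h>

/* Returns 1 if a value of size bytes could be stored at this byte offset
   with an aligned access, 0 otherwise. size must be nonzero. */
int offset_is_aligned(size_t offset, size_t size)
{
    return offset % size == 0;
}

/* Packed layout of "int32 char8 int32":
     offset 0: int32  -> 0 % 4 == 0, aligned
     offset 4: char8  -> 4 % 1 == 0, aligned
     offset 5: int32  -> 5 % 4 == 1, NOT aligned */
```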

First way: you can try a struct. If you declare:

struct
{
    char a;
    int b;
    char c;
    short d;
};

you are out of trouble as far as alignment goes, because the compiler itself takes care of aligning the struct fields. It does so by inserting padding where needed, so read about the alignment-related options in your compiler (in gcc this is simply called alignment), because there is probably a setting that forces a particular alignment or packing of struct fields. GCC can even define alignment per struct.
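Whether a struct meets a "no padding" requirement can be checked rather than guessed. The sketch below uses only standard C (sizeof and offsetof) to report how many padding bytes the compiler added. The struct name sample, the fixed-width field types and the helper functions are assumptions for this illustration, and it presumes the toolchain provides stdint.h.

```c
#include <stddef.h>
#include <stdint.h>

struct sample {
    uint8_t  a;
    uint32_t b;
    uint8_t  c;
    uint16_t d;
};

/* Bytes of real data in the struct: 1 + 4 + 1 + 2. */
#define SAMPLE_PAYLOAD_BYTES 8u

/* Number of padding bytes the compiler added to struct sample. */
size_t sample_padding_bytes(void)
{
    return sizeof(struct sample) - SAMPLE_PAYLOAD_BYTES;
}

/* Byte offset of field b; a value above 1 means padding follows a. */
size_t sample_offset_of_b(void)
{
    return offsetof(struct sample, b);
}
```

On a target that aligns uint32_t to 4 bytes, this would typically report b at offset 4 and 4 padding bytes in total, which is exactly the kind of gap the question wants to avoid on the wire. Reordering the fields from largest to smallest often reduces padding, but only a compiler-specific packing option removes it reliably.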

The other way is to use a "buffer-like approach", something like the one in Carl Norum's answer (I won't duplicate that answer), but also consider using memcpy() calls when more data is copied (e.g. a long long or a string), as this may be faster than copying byte by byte.
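As a sketch of the memcpy() variant: because memcpy() has no alignment requirement on its destination, it can place a 4-byte value at an odd offset without the misaligned store that a pointer cast would produce. The helpers put_bytes and pack_example are hypothetical names for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy size bytes of a value to dst and return the next write position.
   dst may be any byte address, aligned or not. */
unsigned char *put_bytes(unsigned char *dst, const void *src, size_t size)
{
    memcpy(dst, src, size);
    return dst + size;
}

/* Pack "int32 char8 int32" back to back with no gaps.
   Returns the packed length in bytes (4 + 1 + 4). */
size_t pack_example(unsigned char *buf, uint32_t first,
                    uint8_t middle, uint32_t last)
{
    unsigned char *p = buf;
    p = put_bytes(p, &first, sizeof first);    /* offset 0 */
    p = put_bytes(p, &middle, sizeof middle);  /* offset 4 */
    p = put_bytes(p, &last, sizeof last);      /* offset 5, unaligned */
    return (size_t)(p - buf);
}
```

The values are stored in the host's byte order, so the receiver has to read them back the same way (again with memcpy()) and must share the sender's endianness.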

Usually you would use the bit shifting approach, because many chips do not allow you to copy, for example, a 4-byte integer to an odd byte address (or, more accurately, to a set of 4 bytes starting at an odd byte address). This is called alignment. If portability is an issue, or if your DSP does not allow misaligned access, then shifting is necessary. If your DSP incurs a significant performance hit for misaligned access, you might worry about it.

However, I would not write the code with the shifts for the different types done longhand as shown. I would expect to use functions (possibly inline) or macros to handle both the serialization and deserialization of the data. For example:

unsigned char dataTx[1024];
unsigned char *dst = dataTx;

dst += st_int2(short1, dst);
dst += st_int4(int1, dst);
dst += st_char(str, len, dst);
...

In function form, these functions might be:

size_t st_int2(uint16_t value, unsigned char *dst)
{
    *dst++ = (value >> 8) & 0xFF;
    *dst   = value & 0xFF;
    return 2;
}

size_t st_int4(uint32_t value, unsigned char *dst)
{
    *dst++ = (value >> 24) & 0xFF;
    *dst++ = (value >> 16) & 0xFF;
    *dst++ = (value >>  8) & 0xFF;
    *dst   = value & 0xFF;
    return 4;
}

size_t st_char(unsigned char *str, size_t len, unsigned char *dst)
{
    memmove(dst, str, len);
    return len;
}

Granted, such functions make the code boring; on the other hand, they reduce the chance for mistakes too. You can decide whether the names should be st_uint2() instead of st_int2(), and, indeed, you can decide whether the lengths should be in bytes (as here) or in bits (as in the parameter types). As long as you're consistent and boring, you can do as you will. You can also combine these functions into bigger ones that package entire data structures.
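As a sketch of how the small functions might be combined to package an entire data structure, together with the matching deserialization side, consider the following. The struct reading, its fields, and the ld_* and *_reading functions are hypothetical names for this illustration; the st_* helpers are repeated (with index notation) so the block compiles on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record, used only for illustration. */
struct reading {
    uint16_t id;
    uint32_t timestamp;
    uint8_t  status;
};

#define READING_WIRE_SIZE 7u  /* 2 + 4 + 1 bytes, no padding on the wire */

static size_t st_int2(uint16_t value, unsigned char *dst)
{
    dst[0] = (unsigned char)((value >> 8) & 0xFF);
    dst[1] = (unsigned char)(value & 0xFF);
    return 2;
}

static size_t st_int4(uint32_t value, unsigned char *dst)
{
    dst[0] = (unsigned char)((value >> 24) & 0xFF);
    dst[1] = (unsigned char)((value >> 16) & 0xFF);
    dst[2] = (unsigned char)((value >>  8) & 0xFF);
    dst[3] = (unsigned char)(value & 0xFF);
    return 4;
}

/* Read a 2-byte big-endian value; returns the bytes consumed. */
static size_t ld_int2(const unsigned char *src, uint16_t *value)
{
    *value = (uint16_t)(((unsigned)src[0] << 8) | src[1]);
    return 2;
}

/* Read a 4-byte big-endian value; returns the bytes consumed. */
static size_t ld_int4(const unsigned char *src, uint32_t *value)
{
    *value = ((uint32_t)src[0] << 24) | ((uint32_t)src[1] << 16) |
             ((uint32_t)src[2] <<  8) |  (uint32_t)src[3];
    return 4;
}

/* Serialize one record field by field; returns the bytes written. */
size_t st_reading(const struct reading *r, unsigned char *dst)
{
    unsigned char *p = dst;
    p += st_int2(r->id, p);
    p += st_int4(r->timestamp, p);
    *p++ = r->status;
    return (size_t)(p - dst);
}

/* Rebuild one record from the wire bytes; returns the bytes consumed. */
size_t ld_reading(const unsigned char *src, struct reading *r)
{
    const unsigned char *p = src;
    p += ld_int2(p, &r->id);
    p += ld_int4(p, &r->timestamp);
    r->status = *p++;
    return (size_t)(p - src);
}
```

Because each field is written individually, the 7 bytes on the wire are independent of whatever padding the compiler puts inside struct reading in memory.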

The masking operations ( & 0xFF ) may not be necessary with modern compilers. Once upon a very long time ago, I seem to remember that they were necessary to avoid occasional problems with some compilers on some platforms (so, I have code dating back to the 1980s that includes such masking operations). Said platforms have probably gone to rest in peace, so it may be pure paranoia on my part that they're (still) there.

Note that these functions are passing the data in big-endian order. The functions can be used 'as is' on both big-endian and little-endian machines, and the data will be interpreted correctly on both types, so you can have diverse hardware talking over the wire, using this code, and there will be no miscommunication. If you have floating point values to convey, you have to worry a bit more about the representations over the wire. Nevertheless, you should probably aim to have the data transferred in a platform-neutral format so that interworking between chip types is as simple as possible. (This is also why I used the type sizes with numbers in them; 'int' and 'long' in particular can mean different things on different platforms, but a 4-byte signed integer remains a 4-byte signed integer, even if you are unlucky - or lucky - enough to have a machine with 8-byte integers.)
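For floating point values, one possible approach is to copy the bit pattern of the float into a 32-bit integer and then send that integer in big-endian order like any other. This is only a sketch, and it rests on a loud assumption: float must be a 32-bit IEEE 754 value on both ends of the link. The names st_float and ld_float are hypothetical, chosen to match the naming of the other functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a float as 4 big-endian bytes; returns the bytes written.
   ASSUMPTION: float is a 32-bit IEEE 754 value on sender and receiver. */
size_t st_float(float value, unsigned char *dst)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);  /* copy the bit pattern as-is */
    dst[0] = (unsigned char)((bits >> 24) & 0xFF);
    dst[1] = (unsigned char)((bits >> 16) & 0xFF);
    dst[2] = (unsigned char)((bits >>  8) & 0xFF);
    dst[3] = (unsigned char)(bits & 0xFF);
    return 4;
}

/* Rebuild a float from 4 big-endian bytes; returns the bytes consumed. */
size_t ld_float(const unsigned char *src, float *value)
{
    uint32_t bits = ((uint32_t)src[0] << 24) | ((uint32_t)src[1] << 16) |
                    ((uint32_t)src[2] <<  8) |  (uint32_t)src[3];
    memcpy(value, &bits, sizeof bits);
    return 4;
}
```

Under that assumption, 1.0f has the bit pattern 0x3F800000 and goes out as the bytes 3F 80 00 00. If either side uses a different floating point format, the value would need an explicit conversion instead of a plain bit copy.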

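Notice: the technical posts on this site are published under the CC BY-SA 4.0 license. If you repost this content, please cite this site's URL or the original source. For any questions, contact yoyou2525@163.com.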

 