
Can I re-cast an entire array from *void to *int?

I have a function which reads a binary file into memory as type void *. Information in the file header indicates the amount of memory required and the actual data type (in bytes per number, e.g. 8 if it should be interpreted as "long").

My problem is, main has no knowledge of the data type or memory required. So I call the function like this:

long myfread(char *infile, void **tempdata, size_t *datasize);

char *infile="data.bin"; // name of the input file
void *tempdata=NULL; // where the data will be stored, initially NULL
long n; // total numbers read, returned by the function
size_t datasize; // modified appropriately by the function

n = myfread(infile,&tempdata,&datasize);

So far so good - main can read the bytes in "tempdata" - but not as (say) integers or floats. My question is, is there a simple way to recast tempdata to make this possible?

I think that you are not talking about an array, but about a block of memory.

A pointer, whether it is void *, char * or int *, holds an address in memory (possibly virtual, and mostly on the heap). The only difference between these types is how the memory at that address is interpreted.

Say you have a 16-byte memory block: viewed as byte[] you get 16 elements, viewed as int[] (at 32 bits per int) you get 4, and so on. When you apply an index to it, the byte offset increases according to the size of the data type.
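As a rough illustration of the paragraph above, here is a minimal sketch in C. The helper names element_count and byte_offset_int32 are made up for this example, and int32_t stands in for "an int of 32 bits".

```c
#include <stddef.h>
#include <stdint.h>

/* Number of whole elements of elem_size bytes that fit in a block. */
size_t element_count(size_t block_bytes, size_t elem_size)
{
    return block_bytes / elem_size;
}

/* Byte offset of element `index` (0..3) in a 16-byte block viewed as
   int32_t, measured by letting the compiler do the pointer arithmetic. */
size_t byte_offset_int32(size_t index)
{
    int32_t block[4] = {0};   /* 4 elements * 4 bytes = 16 bytes */
    const unsigned char *base = (const unsigned char *)block;
    const unsigned char *elem = (const unsigned char *)&block[index];
    return (size_t)(elem - base);
}
```

The same 16 bytes give 16 elements when the element size is 1 and 4 elements when it is 4, and index 3 in the int32_t view sits 12 bytes from the start.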

The most important thing is that the memory block is large enough for your data type. That is, you should not access a location beyond the end of the memory block. Say you have 10 bytes of memory and your pointer is int *a with a 4-byte int: a[0] and a[1] fit, but a[2] would cover bytes 8 to 11 and so reads past the end of the block, which is an access violation.

Can I re-cast an entire array from *void to *int?

I believe there's no such thing as a void array. As for casting pointer types, you are free to do so in C.
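A small sketch of that conversion, under the assumption that the block really was filled with int values. make_block and sum_as_int are hypothetical helpers written for this example, not part of the question's code.

```c
#include <stdlib.h>

/* Allocates `count` ints and hands them back as void *,
   the way a generic reader function might. Caller frees. */
void *make_block(size_t count)
{
    int *p = malloc(count * sizeof *p);
    if (p != NULL) {
        for (size_t i = 0; i < count; i++)
            p[i] = (int)i * 10;
    }
    return p;   /* int * converts to void * implicitly */
}

/* Sums the block by viewing it as int again. */
long sum_as_int(const void *block, size_t count)
{
    const int *ip = block;   /* no cast needed in C; C++ would need one */
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += ip[i];
    return total;
}
```

Nothing is copied or converted here; the pointer assignment only changes how the same bytes are read.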

Ok, so myfread looks something like this:

long myfread(char *infile, void **data, size_t *datasize)
{
   long res;
   FILE *f = fopen(infile, "rb");   // Or some such.
   ...

   *datasize = ... // some calculation of some sort, e.g. seek to end of file?

   *data = malloc(*datasize ... );   // Maybe more calculation?

   res = fread(*data, 1, *datasize, f);   // fread takes (buffer, element size, count, stream)

   fclose(f);

   return res;
}

And then later, you want to treat the updated *data as an int *?

int *my_int_array;

n = myfread(infile,&tempdata,&datasize);

my_int_array = tempdata;   // If a C++ compiler, you need a cast to (int *)

// datasize is a byte count in the sketch above, so divide by the element size
for(size_t i = 0; i < datasize / sizeof(int); i++)
{
   printf("%d\n", my_int_array[i]);
}

Of course, if myfread doesn't do what I think it does, all bets are off.

Based on your edited question, I can make a guess as to what myfread looks like. Simplified tremendously, it does something like this:

long myfread(const char *path, void **pmem, size_t *datasize) {
    long magically_found = 42;
    int *mem;
    int i;

    mem = malloc(magically_found * sizeof(int)); /* and we assume it works */
    *datasize = 12345;
    for (i = 0; i < magically_found; i++)
        mem[i] = i;
    *pmem = mem;
    return magically_found;
}

Now, in your main, you have to somehow know that if datasize == 12345 upon return, the allocated memory has been filled with ints. Knowing this, you then simply write:

    int *ip;
    ... /* your code from above, more or less */
    if (datasize != 12345) {
        panic("memory was not filled with ints");
        /* NOTREACHED */
    }
    ip = tempdata;

From here on you can access ip[i], for any valid i (at least 0 and less than n).

The tougher question is, how do you know that 12345 means int, and what the heck do you do if it's not 12345? And 12345 probably does not mean int anyway. Maybe 4 means int or float, which both happen to have a sizeof of 4, in which case having datasize == 4 does not tell you which one it is after all! So, then what?
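One possible way out, sketched here purely as an assumption about how a file header could be designed, is to store an explicit type tag rather than only a size. The enum elem_type and both helpers below are hypothetical; nothing in the question says the file actually carries such a tag.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tags that a file header could store. */
enum elem_type { ELEM_INT32, ELEM_FLOAT32, ELEM_FLOAT64 };

/* Bytes per element implied by a tag; 0 for an unknown tag. */
size_t elem_size(enum elem_type t)
{
    switch (t) {
    case ELEM_INT32:   return sizeof(int32_t);
    case ELEM_FLOAT32: return sizeof(float);
    case ELEM_FLOAT64: return sizeof(double);
    }
    return 0;
}

/* Returns element i converted to double, choosing the view from the tag. */
double elem_as_double(const void *data, enum elem_type t, size_t i)
{
    switch (t) {
    case ELEM_INT32:   return (double)((const int32_t *)data)[i];
    case ELEM_FLOAT32: return (double)((const float *)data)[i];
    case ELEM_FLOAT64: return ((const double *)data)[i];
    }
    return 0.0;
}
```

On typical platforms the int32 and float32 tags report the same size, which is exactly why the size alone cannot pick the right pointer type.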

All in all, it sounds like the question is underspecified, at least.

I'm having a hard time understanding what you want, and I think you might be too. It seems like you have a function similar to read or fread that takes an argument of type void * for where to store the data it reads. This does not mean you make a variable of type void * to pass to it. Instead, you pass the address of the object you want the data stored into.

In your case, simply make an array of int of the appropriate size and pass the address of that array (or the address of its first element) to the function that does the reading. For example (assuming fread):

int my_array[100];
fread(my_array, sizeof my_array, 1, f);

If you don't know the size in advance, or if it needs to live past the return of the calling function, you can allocate space for the array with malloc.
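A minimal sketch of the malloc variant, assuming the element count has already been obtained somehow (for instance from the file header). read_ints is a hypothetical helper name, not something from the question.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads `count` ints from f into a freshly allocated array.
   Returns NULL on allocation failure or a short read; the caller frees. */
int *read_ints(FILE *f, size_t count)
{
    int *arr = malloc(count * sizeof *arr);
    if (arr == NULL)
        return NULL;
    if (fread(arr, sizeof *arr, count, f) != count) {
        free(arr);   /* do not hand back a partially filled buffer */
        return NULL;
    }
    return arr;
}
```

The buffer is typed as int * from the start, so no void * variable is needed on the caller's side.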

for(i = 0; i < index_max; i++) {
    printf("%d\n", ((int*)tempdata)[i]);
}

Yes, you can cast a pointer to another type, but it's hard to avoid undefined behavior if you do so. For example, you have to make sure the binary data you're casting is aligned correctly, and that the memory representation in the code that wrote the data is the same as the memory representation of the code that's reading it. This isn't just an academic problem: you're likely to find endian differences across architectures, and doubles, for example, have to be carefully aligned on ARM machines.

You can solve the alignment problems by writing functions that use memcpy to access the memory as if it were a typed array. For example,

#include <string.h>   /* memcpy */

int get_int(const char *array, int idx) {
    int result;
    memcpy(&result, array + idx * sizeof(int), sizeof(int));
    return result;
}

To avoid writing this out N times, you can macroize it.

#define MAKE_GET(T) T get_##T (const char *array, int idx) { \
    T result; \
    memcpy(&result, array + idx * sizeof(T), sizeof(T)); \
    return result; \
}

MAKE_GET(int)
MAKE_GET(float)
MAKE_GET(double)

To solve the endian problem, or more generally the problem that memory representations can differ across machines, you need to have a well-defined format for your binary file (for example, always writing ints little-endian). One good approach is to use text (compressed with zlib or similar if you need it small). Another is to use a serialisation library (for example, Google's protocol buffers). Or you can roll your own - it's not too hard.
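As a sketch of the roll-your-own route, assuming the file format is defined to store 32-bit unsigned integers little-endian, the two helpers below (hypothetical names) decode and encode one value byte by byte. Because they only use shifts on individual bytes, they behave the same on big-endian and little-endian hosts and have no alignment requirements.

```c
#include <stdint.h>

/* Decodes a 32-bit unsigned integer stored least significant byte first. */
uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Encodes v into 4 bytes, least significant byte first. */
void write_u32_le(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}
```

Signed and floating-point values need an agreed representation on top of this, which is part of what a well-defined file format has to spell out.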
