Encoding values from various data types into a binary file (Python) and then decoding it in C

Question

I need to store values of various data types in a binary file (in Python), then decode this binary file in C and reconstruct the values. To clarify, let's assume we have the three variables below:

a = [1, 2, 3], dtype = int
b = ['sky', 'chair', 'book', 'desk']
c = [3.56, 4.69, 55.0, 1.698], dtype = float32

Step 1: export all data values to a binary file (data).

Step 2: import the binary file in C and then reconstruct the corresponding values.

//load binary file ?
...

// declaration
int a[3];
char *b[4];
double c[4];

// decode the binary file to have the same values in c ?
...
a = [1, 2, 3];
b = {'sky', 'chair', 'book', 'desk'};
c = [3.56, 4.69, 55.0, 1.698];

Thanks in advance for your help.

I'm trying to build something that dynamically exports the data to a binary file and can also be read back dynamically in C, using a data structure like this: [number of integers, all integer values, number of floats, all float values, number of strings, all strings]

For example: [3, 1, 2, 3, 4, 0.2, 0.65, 0.56, 0.33, 2, 'sky', 'desk']

In summary, I also need to pass the number of elements of each data type to the decoder in C. If I know that the values are stored in this order (ints, floats, strs), decoding goes as follows:

Step 1: Read the first number (3); now we know that there are 3 int values.
Step 2: Read the next three numbers as ints.
Step 3: Read the number of float values (4).
Step 4: Read four float numbers.
Step 5: Read the number of strings (2).
Step 6: Read the two strings.

I know that I can use the struct package to create binary packs and then write them, but I don't know what generic strategy I should follow for the unpacking process, which is implemented in C.

Answer 1

a data structure like this: [number of integers, all integer values, number of floats, all float values, number of strings, all strings]

In preparation for writing the strings to the file, I'd convert them to bytes, e.g.:

bb = [s.encode() for s in b]

The store part in Python amounts to providing the counts and values along with an appropriate format string.

integers: The straightforward format string 'I%di'%len(a) covers the number of integers (I) and all integer values (%di), where %d is replaced by the number of items in a.
floats: The straightforward format string 'I%df'%len(c) covers the number of floats (I) and all float values (%df), where %d is replaced by the number of items in c.
strings: The format string is a little less straightforward because struct.pack doesn't allow a repeat count for strings; the count before s is the string length instead. ''.join(['%ds'%(len(s)+1) for s in bb]) constructs the format for all string values (%ds), where %d is replaced by the number of bytes in each string plus one for the terminating NUL. (Lacking a detailed specification, I chose to store the strings in C form, i.e. NUL-terminated.)
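To make the format string concrete, here is a small sketch that builds it for the sample values from the question and prints the result. The variable name fmt is only for illustration, and the reported size assumes a platform where int and float are 4 bytes each.

```python
import struct

# Sample values taken from the question.
a = [1, 2, 3]
c = [3.56, 4.69, 55.0, 1.698]
bb = [s.encode() for s in ['sky', 'chair', 'book', 'desk']]

# count + ints, count + floats, count + one "<length>s" item per string
fmt = ('I%di' % len(a) + 'I%df' % len(c) + 'I'
       + ''.join('%ds' % (len(s) + 1) for s in bb))

print(fmt)                   # I3iI4fI4s6s5s5s
print(struct.calcsize(fmt))  # 60 where int and float are 4 bytes
```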

This gives:

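import struct

# counts and values are passed in the same order as in the format string
fmt = 'I%di'%len(a) + 'I%df'%len(c) + 'I' + ''.join('%ds'%(len(s)+1) for s in bb)
data = struct.pack(fmt, len(a), *a, len(c), *c, len(bb), *bb)
with open('data', 'wb') as f:
    f.write(data)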
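Before writing the C side, the layout can be checked by reading it back in Python with the same six steps from the question. This is only a sketch: pack_record and unpack_record are hypothetical helper names, and it assumes the native byte order and 4-byte int and float that the format string above implies on common platforms.

```python
import struct

def pack_record(ints, floats, strings):
    """Pack [n_ints, ints..., n_floats, floats..., n_strs, NUL-terminated strs...]."""
    bb = [s.encode() for s in strings]
    fmt = ('I%di' % len(ints) + 'I%df' % len(floats) + 'I'
           + ''.join('%ds' % (len(s) + 1) for s in bb))
    return struct.pack(fmt, len(ints), *ints, len(floats), *floats, len(bb), *bb)

def unpack_record(data):
    """Read the record back, mirroring steps 1 to 6 of the question."""
    pos = 0

    def take(fmt):
        # Unpack fmt at the current position and advance past it.
        nonlocal pos
        values = struct.unpack_from(fmt, data, pos)
        pos += struct.calcsize(fmt)
        return values

    (ni,) = take('I')                 # step 1: number of ints
    ints = list(take('%di' % ni))     # step 2: the ints
    (nf,) = take('I')                 # step 3: number of floats
    floats = list(take('%df' % nf))   # step 4: the floats
    (ns,) = take('I')                 # step 5: number of strings
    strings = []
    for _ in range(ns):               # step 6: NUL-terminated strings
        end = data.index(b'\0', pos)
        strings.append(data[pos:end].decode())
        pos = end + 1
    return ints, floats, strings

record = pack_record([1, 2, 3], [3.56, 4.69, 55.0, 1.698],
                     ['sky', 'chair', 'book', 'desk'])
ints, floats, strings = unpack_record(record)
print(ints, strings)
```

The floats come back with float32 precision, so they match the inputs only approximately; the C decoder will see the same rounding.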

The decode part in C is not very complicated, following the steps you outlined, e.g.:

#include <stdio.h>
#include <stdlib.h>
int geti(FILE *stream)
{   // helper function to read an integer
    int i;
    if (!fread(&i, sizeof i, 1, stream)) exit(EXIT_FAILURE);
    return i;
}
…
    // step 1: Read first number of int values.
    int ni = geti(stdin);
    int a[ni];
    // step 2: Read the next 'ni' numbers as ints.
    for (int i = 0; i < ni; ) a[i++] = geti(stdin);
    // step 3: Read the number of float values.
    int nf = geti(stdin);
    double c[nf];
    // step 4: read 'nf' float numbers.
    for (int i = 0; i < nf; )
    {
        float f;    // stored as 4-byte float, widened to double on assignment
        if (!fread(&f, sizeof f, 1, stdin)) exit(EXIT_FAILURE);
        c[i++] = f;
    }
    // step 5: read the number of strings,
    int ns = geti(stdin);
    char *b[ns];
    // step 6: read the 'ns' strings.
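    for (int i = 0, j, ch; i < ns; ++i)     // 'ch' avoids shadowing the array c
    {
        b[i] = NULL, j = 0;
        do
        {   // grow the buffer by one byte per character read
            b[i] = realloc(b[i], j+1);
            ch = getc(stdin);
            if (ch == EOF) exit(EXIT_FAILURE);
        } while ((b[i][j++] = ch));         // stop after storing the NUL
    }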

Note that in this example:

of course you could use a stream other than stdin,
not all errors (read failure or out of memory) are checked,
no provisions are made for the case that the number representations in source and destination are different; for this, you could use struct.pack's byte order and size indication and/or ntohl() in C.
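One way to act on the last point is to prefix the format with <, which makes struct use little-endian byte order with standard sizes (4-byte I, i and f) and no padding, regardless of the machine that writes the file. A minimal sketch, reusing the sample values from the question; the function name pack_portable is hypothetical:

```python
import struct

def pack_portable(ints, floats, strings):
    """Same layout as above, but with a fixed little-endian representation."""
    bb = [s.encode('utf-8') for s in strings]
    # The leading '<' applies to the whole format string.
    fmt = ('<I%di' % len(ints) + 'I%df' % len(floats) + 'I'
           + ''.join('%ds' % (len(s) + 1) for s in bb))
    return struct.pack(fmt, len(ints), *ints, len(floats), *floats, len(bb), *bb)

data = pack_portable([1, 2, 3], [3.56, 4.69, 55.0, 1.698],
                     ['sky', 'chair', 'book', 'desk'])
print(data[:8])  # b'\x03\x00\x00\x00\x01\x00\x00\x00'
```

With this choice the C reader works unchanged on a little-endian machine and otherwise has to assemble each 4-byte value from its bytes. If > or ! (network order, big-endian) is used instead, ntohl() in C converts the 4-byte integers to host order.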

Answer 1 (accepted, 2019-08-07 08:42:26)