
Encode values from various data types into a binary file (Python) and then decode them in C

I need to store values of various data types in a binary file (in Python), then decode this binary file in C and reconstruct the values. To clarify, let's assume we have the three variables below:

a = [1, 2, 3], dtype = int
b = ['sky', 'chair', 'book', 'desk']
c = [3.56, 4.69, 55.0, 1.698], dtype = float32

Step 1: export all data values to a binary file (data).

Step 2: import the binary file in C and reconstruct the corresponding values.

//load binary file ?
...

// declaration
int a[3];
char *b[4];
double c[4];

// decode the binary file so that the arrays hold the same values in C ?
...
// desired result:
//   a = {1, 2, 3};
//   b = {"sky", "chair", "book", "desk"};
//   c = {3.56, 4.69, 55.0, 1.698};

Thanks in advance for your help.

I'm trying to build something that dynamically exports the data to a binary file and dynamically retrieves it in C, using a data structure like this: [number of integers, all integer values, number of floats, all float values, number of strings, all strings]

For example: [3, 1, 2, 3, 4, 0.2, 0.65, 0.56, 0.33, 2, 'sky', 'desk']

In summary, I also need to pass the number of elements of each data type to the decoder in C. If I know that the values are stored in this order (ints, floats, strs), decoding goes as follows:

  • step 1: Read the first number (3); now we know that there are 3 int values.
  • step 2: Read the next three numbers as ints.
  • step 3: Read the number of float values (4).
  • step 4: Read four float numbers.
  • step 5: Read the number of strings (2).
  • step 6: Read the two strings.

I know that I can use the struct package to create binary packs and then write them, but I don't know what generic strategy I should follow for the unpacking process, which is implemented in C.

a data structure like this: [number of integers, all integer values, number of floats, all float values, number of strings, all strings]

In preparation for writing the strings to the file, I'd convert them to bytes, e.g.:

bb = [s.encode() for s in b]

The store part in Python amounts to providing the numbers and values along with an appropriate format string.

  • integers: The straightforward format string 'I%di'%len(a) covers the number of integers (I) and all integer values (%di), where %d is replaced by the number of items in a.
  • floats: The straightforward format string 'I%df'%len(c) covers the number of floats (I) and all float values (%df), where %d is replaced by the number of items in c.
  • strings: The format string is a little less straightforward, because struct.pack doesn't allow a repeat count for strings but requires the string length instead. ''.join(['%ds'%(len(s)+1) for s in bb]) constructs the format for all string values (%ds), where %d is replaced by the number of bytes in each string plus one for the terminating NUL. (Lacking a detailed specification, I chose to store the strings in C form.)
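To make the three pieces concrete, here is a minimal sketch that builds the complete format string for the example values from the question. The variable names fmt_ints, fmt_floats and fmt_strings are only illustrative and are not part of the answer's code.

```python
import struct

a = [1, 2, 3]
b = ['sky', 'chair', 'book', 'desk']
c = [3.56, 4.69, 55.0, 1.698]
bb = [s.encode() for s in b]

# One count field ('I') followed by the values, for each of the three groups.
fmt_ints = 'I%di' % len(a)
fmt_floats = 'I%df' % len(c)
fmt_strings = 'I' + ''.join('%ds' % (len(s) + 1) for s in bb)

fmt = fmt_ints + fmt_floats + fmt_strings
print(fmt)                   # I3iI4fI4s6s5s5s
print(struct.calcsize(fmt))  # total size in bytes (platform-dependent in native mode)
```

Each string field is one byte longer than the string itself, which is how the terminating NUL ends up in the file: struct.pack pads an 's' field with zero bytes.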

This gives:

import struct
data = struct.pack('I%di'%len(a)+'I%df'%len(c)+'I'+''.join(['%ds'%(len(s)+1) for s in bb]),
                    len(a), *a,   len(c), *c,   len(bb),       *bb)
with open('data', 'wb') as f:   # binary mode; the file is closed automatically
    f.write(data)
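Before moving to C, it can help to check the layout by reading the bytes back in Python, following the same six steps. This is only a sketch: read and decode are hypothetical helper names, the data is kept in memory instead of a file, and native sizes and byte order are assumed on both sides.

```python
import struct

a = [1, 2, 3]
bb = [b'sky', b'chair', b'book', b'desk']
c = [3.56, 4.69, 55.0, 1.698]
fmt = ('I%di' % len(a) + 'I%df' % len(c)
       + 'I' + ''.join('%ds' % (len(s) + 1) for s in bb))
data = struct.pack(fmt, len(a), *a, len(c), *c, len(bb), *bb)


def read(fmt, buf, pos):
    """Hypothetical helper: unpack `fmt` at `pos`, return (values, new pos)."""
    return struct.unpack_from(fmt, buf, pos), pos + struct.calcsize(fmt)


def decode(buf):
    """Walk the buffer in the agreed order: ints, floats, strings."""
    (ni,), pos = read('I', buf, 0)            # step 1: number of ints
    ints, pos = read('%di' % ni, buf, pos)    # step 2: the ints
    (nf,), pos = read('I', buf, pos)          # step 3: number of floats
    floats, pos = read('%df' % nf, buf, pos)  # step 4: the floats
    (ns,), pos = read('I', buf, pos)          # step 5: number of strings
    strings = []
    for _ in range(ns):                       # step 6: NUL-terminated strings
        end = buf.index(b'\0', pos)
        strings.append(buf[pos:end].decode())
        pos = end + 1
    return list(ints), list(floats), strings


ints, floats, strings = decode(data)
print(ints, strings)  # [1, 2, 3] ['sky', 'chair', 'book', 'desk']
```

The floats come back as the nearest 32-bit values (for example, 3.56 is only approximately 3.56 after the round trip), which is also what a C reader will see.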

The decode part in C is not very complicated if you follow the steps you outlined, e.g.:

#include <stdio.h>
#include <stdlib.h>
int geti(FILE *stream)
{   // helper function to read an integer
    int i;
    if (!fread(&i, sizeof i, 1, stream)) exit(EXIT_FAILURE);
    return i;
}
…
    // step 1: Read first number of int values.
    int ni = geti(stdin);
    int a[ni];
    // step 2: Read the next 'ni' numbers as ints.
    for (int i = 0; i < ni; ) a[i++] = geti(stdin);
    // step 3: Read the number of float values.
    int nf = geti(stdin);
    double c[nf];
    // step 4: read 'nf' float numbers.
    for (int i = 0; i < nf; )
    { float f; fread(&f, sizeof f, 1, stdin); c[i++] = f; }
    // step 5: read the number of strings,
    int ns = geti(stdin);
    char *b[ns];
    // step 6: read the 'ns' strings.
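    for (int i = 0, j, ch; i < ns; ++i)   // 'ch' avoids shadowing the array c
    {
        b[i] = NULL, j = 0;
        do
        {   // grow the buffer by one byte, then store the next character
            b[i] = realloc(b[i], j+1);
            ch = getc(stdin);
            if (ch == EOF) exit(EXIT_FAILURE);
        } while ((b[i][j++] = ch) != '\0');   // stop once the NUL is stored
    }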

Note that in this example:

  1. of course you could use a stream other than stdin,
  2. not all errors (read failure or out of memory) are checked,
  3. no provisions are made for the case where the number representations in source and destination differ; for this, you could use struct.pack's byte order and size indication and/or ntohl() in C.
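As an illustration of the third note, the sketch below packs the same integers with an explicit byte-order prefix. With '<' or '!' the struct module uses standard sizes (4 bytes for I, i and f) and no padding, so the file layout no longer depends on the machine running Python. Which prefix to pick is an assumption here; it has to match what the C reader expects.

```python
import struct

a = [1, 2, 3]

# '<' = little-endian, '!' = network (big-endian); both use standard sizes.
little = struct.pack('<I%di' % len(a), len(a), *a)
network = struct.pack('!I%di' % len(a), len(a), *a)

print(little.hex())   # 03000000010000000200000003000000
print(network.hex())  # 00000003000000010000000200000003
```

With the '!' variant, each 32-bit integer read in C could be passed through ntohl() to convert it to the host's byte order.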
