
PostgreSQL: variable length user-defined data type storage setting

I defined a variable-length user-defined data type in PostgreSQL, following the docs ( http://www.postgresql.org/docs/9.0/static/xtypes.html ).

C definition:

typedef struct MyType {
    char    vl_len_[4];
    char    data[1];
} mytype;

CREATE TYPE statements:

CREATE TYPE mytype;
CREATE FUNCTION mytype_in(cstring) RETURNS mytype AS 'mytype' LANGUAGE C IMMUTABLE STRICT;
CREATE FUNCTION mytype_out(mytype) RETURNS cstring AS 'mytype' LANGUAGE C IMMUTABLE STRICT;
CREATE FUNCTION mytype_recv(internal) RETURNS mytype AS 'mytype' LANGUAGE C IMMUTABLE STRICT;
CREATE FUNCTION mytype_send(mytype) RETURNS bytea AS 'mytype' LANGUAGE C IMMUTABLE STRICT;

CREATE TYPE mytype (
 internallength = VARIABLE,
 input = mytype_in,
 output = mytype_out,
 receive = mytype_recv,
 send = mytype_send,
 alignment = int4,
 storage = plain
);

I also defined the functions in C, and all of this works well. However, since my data could be very long, I changed the storage from plain to external or extended. After that, the output is wrong. Is there some TOAST function I need to use in my C functions?

For example:

I have an operator that merges two values, as follows:

PG_FUNCTION_INFO_V1(mytype_add);

Datum
mytype_add(PG_FUNCTION_ARGS)
{
    mytype *anno1 = (mytype *) PG_GETARG_POINTER(0);
    mytype *anno2 = (mytype *) PG_GETARG_POINTER(1);
    mytype    *result;
    int     newsize;

    newsize = VARSIZE(anno1) + VARSIZE(anno2) - VARHDRSZ;
    result = (mytype *) palloc(newsize);
    SET_VARSIZE(result, newsize);
    memcpy(result->data, anno1->data, VARSIZE(anno1) - VARHDRSZ);
    memcpy((result->data + VARSIZE(anno1) - VARHDRSZ), anno2->data, VARSIZE(anno2) - VARHDRSZ);

    PG_RETURN_POINTER(result);
}

The values in anno1->data (12 bytes, 3 integers) are 10, -1, -1, and the values in anno2->data are 20, -1, -1.

So the values in result->data (24 bytes) should be 10, -1, -1, 20, -1, -1.

If I set the storage to plain, I get the correct result above. If I set the storage to external, the output is totally wrong: -256, -1, 1317887 ...

Any hint would be much appreciated. I have spent many hours on this.

You are failing to detoast the input Datum. So you're concatenating a compressed form, or possibly a pointer to out-of-line storage, rather than the raw data.

I think you need to use PG_GETARG_VARLENA_P(0) to ensure the datum is detoasted before working with it. I have not worked directly with TOAST and varlena types much, though.

It's not clear to me why you're declaring your own type with an identical structure to struct varlena, rather than just using Datum and the underlying struct varlena for variable-length datums. Start with:

struct varlena *anno1 = PG_GETARG_VARLENA_P(0);

On a side note, why are you trying to re-implement intarray (badly, i.e. using char arrays)? Please read this relevant article, and this one.

One more thing to add: the main difference is the structure in which the value is stored, which starts with a header that indicates the length of the data.

Therefore, when writing the input function, you need to reserve a 4-byte header before your data starts and use SET_VARSIZE(PTR, len) to set the value of that header. Note that len is the total size of the value, including the header itself.

On the other hand, when retrieving the data, you need to use PG_GETARG_VARLENA_P(n), and the retrieved value also starts with a 4-byte header that indicates the length. VARSIZE_4B(PTR) returns the total length in bytes, including that header; subtract VARHDRSZ (or use VARSIZE_ANY_EXHDR(PTR)) to get the length of the data alone.
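
To make the header arithmetic concrete, here is a minimal standalone sketch in plain C. It does not use PostgreSQL's headers: MODEL_HDRSZ, model_set_size, model_size, model_data, model_make and model_concat are hypothetical stand-ins for VARHDRSZ, SET_VARSIZE, VARSIZE and VARDATA, and the real macros additionally encode TOAST flag bits in the header, which this model ignores. The sketch only shows that the stored size counts the header, and that concatenating two values means copying both payloads behind a single new header:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for VARHDRSZ: a plain 4-byte total-size header. */
#define MODEL_HDRSZ 4

/* Store the total size (header + payload) in the first 4 bytes. */
void model_set_size(void *ptr, uint32_t total)
{
    memcpy(ptr, &total, MODEL_HDRSZ);
}

/* Read back the total size, header included. */
uint32_t model_size(const void *ptr)
{
    uint32_t total;

    memcpy(&total, ptr, MODEL_HDRSZ);
    return total;
}

/* Pointer to the first payload byte, right after the header. */
char *model_data(void *ptr)
{
    return (char *) ptr + MODEL_HDRSZ;
}

/* Build a value from a raw payload; the caller frees the result. */
void *model_make(const void *payload, uint32_t payload_len)
{
    void *v = malloc(MODEL_HDRSZ + payload_len);

    assert(v != NULL);
    model_set_size(v, MODEL_HDRSZ + payload_len);
    memcpy(model_data(v), payload, payload_len);
    return v;
}

/* Concatenate the payloads of a and b behind one new header. */
void *model_concat(void *a, void *b)
{
    uint32_t la = model_size(a) - MODEL_HDRSZ;   /* payload length of a */
    uint32_t lb = model_size(b) - MODEL_HDRSZ;   /* payload length of b */
    void    *r = malloc(MODEL_HDRSZ + la + lb);

    assert(r != NULL);
    model_set_size(r, MODEL_HDRSZ + la + lb);
    memcpy(model_data(r), model_data(a), la);
    memcpy(model_data(r) + la, model_data(b), lb);
    return r;
}
```

With the payloads 10, -1, -1 and 20, -1, -1 from the question, the concatenated value has a total size of 28 bytes (4 for the header plus 24 of data). In a real extension the arguments would first have to be detoasted, for example with PG_GETARG_VARLENA_P.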

To summarize with some sample code, assume we want to store an unknown number of struct Complex:

typedef struct Complex 
{
    double      x;
    double      y;
} Complex;

So after receiving the input string, suppose we have determined that we need to store n structs. First, allocate memory for the data plus the header:

struct varlena *result = (struct varlena *) palloc(n * sizeof(Complex) + VARHDRSZ);

As stated in the documentation, we need to fill in the first 4 bytes with the total length, header included:

SET_VARSIZE(result, n * sizeof(Complex) + VARHDRSZ);

Next, fill in the bytes after the header with values. Remember that alignment matters: the data begins 4 bytes into the allocation, so it is not necessarily aligned for double on every platform:

Complex *a = (Complex *) VARDATA(result);
for (int i = 0; i < n; i++) {
    a[i].x = input[i];
    a[i].y = input[i];
}
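
If alignment is a concern, one option is to copy elements in and out with memcpy instead of dereferencing a Complex pointer into the payload. The following is a standalone sketch of that idea in plain C, not PostgreSQL code: MODEL_HDRSZ, pack_complex and unpack_complex are hypothetical names, malloc stands in for palloc, and the header is a plain total size without PostgreSQL's flag bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct Complex
{
    double      x;
    double      y;
} Complex;

/* Hypothetical stand-in for VARHDRSZ. */
#define MODEL_HDRSZ 4

/* Copy n elements into a new buffer behind a 4-byte total-size header. */
void *pack_complex(const Complex *src, int n)
{
    size_t   payload = (size_t) n * sizeof(Complex);
    uint32_t total = (uint32_t) (MODEL_HDRSZ + payload);
    char    *buf = malloc(MODEL_HDRSZ + payload);

    assert(buf != NULL);
    memcpy(buf, &total, MODEL_HDRSZ);          /* header: total size */
    memcpy(buf + MODEL_HDRSZ, src, payload);   /* payload: raw elements */
    return buf;
}

/* Read element i back without assuming the payload is aligned for double. */
Complex unpack_complex(const void *buf, int i)
{
    Complex     c;
    const char *p = (const char *) buf + MODEL_HDRSZ
                    + (size_t) i * sizeof(Complex);

    memcpy(&c, p, sizeof c);
    return c;
}
```

The extra copy costs a little, but it avoids reading a double through a pointer that is only 4-byte aligned.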

Finally, return the value with:

PG_RETURN_POINTER(result);

To retrieve the data, use:

struct varlena *b = PG_GETARG_VARLENA_P(0);

As stated above, the retrieved value also has a 4-byte header at the front stating the length. The output function could be:

Complex    *c = (Complex *) VARDATA_ANY(b);
int         n = VARSIZE_ANY_EXHDR(b) / sizeof(Complex);
StringInfoData buf;             /* declared in lib/stringinfo.h */

initStringInfo(&buf);
for (int i = 0; i < n; i++) {
    /* append each element rather than overwriting the previous one */
    appendStringInfo(&buf, "(%g;%g)", c[i].x, c[i].y);
}
PG_RETURN_CSTRING(buf.data);
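
An output function has to return one string that covers every element, so the per-element text needs to be accumulated rather than replaced on each iteration. Here is a standalone sketch of that accumulation in plain C, outside PostgreSQL: format_complex_array is a hypothetical helper, and malloc/realloc stand in for PostgreSQL's memory management (inside the server, StringInfo with initStringInfo and appendStringInfo plays the same role).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct Complex
{
    double      x;
    double      y;
} Complex;

/* Return a malloc'd string of the form "(x;y)(x;y)..."; the caller frees it. */
char *format_complex_array(const Complex *c, int n)
{
    size_t  cap = 64;
    size_t  len = 0;
    char   *out = malloc(cap);

    assert(out != NULL);
    out[0] = '\0';              /* an empty array yields an empty string */

    for (int i = 0; i < n; i++) {
        /* First measure how many characters this element needs. */
        int need = snprintf(NULL, 0, "(%g;%g)", c[i].x, c[i].y);

        assert(need >= 0);
        /* Grow the buffer until the element and the terminator fit. */
        while (len + (size_t) need + 1 > cap) {
            cap *= 2;
            out = realloc(out, cap);
            assert(out != NULL);
        }
        /* Append after what has been written so far. */
        snprintf(out + len, cap - len, "(%g;%g)", c[i].x, c[i].y);
        len += (size_t) need;
    }
    return out;
}
```

For the two elements (1, 2) and (3.5, -4) this produces the string "(1;2)(3.5;-4)".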

I haven't tested this exact code, only a similar version, but the result should be OK. Comments or corrections to any mistakes I made are welcome. This is also for my own reference.
