PostgreSQL: variable-length user-defined data type storage setting
I defined a variable-length user-defined data type in PostgreSQL according to the docs ( http://www.postgresql.org/docs/9.0/static/xtypes.html ).
C definition:
typedef struct MyType {
char vl_len_[4];
char data[1];
} mytype;
CREATE TYPE statements:
CREATE TYPE mytype;
CREATE FUNCTION mytype_in(cstring) RETURNS mytype AS 'mytype' LANGUAGE C IMMUTABLE STRICT;
CREATE FUNCTION mytype_out(mytype) RETURNS cstring AS 'mytype' LANGUAGE C IMMUTABLE STRICT;
CREATE FUNCTION mytype_recv(internal) RETURNS mytype AS 'mytype' LANGUAGE C IMMUTABLE STRICT;
CREATE FUNCTION mytype_send(mytype) RETURNS bytea AS 'mytype' LANGUAGE C IMMUTABLE STRICT;
CREATE TYPE mytype (
internallength = VARIABLE,
input = mytype_in,
output = mytype_out,
receive = mytype_recv,
send = mytype_send,
alignment = int4,
storage = plain
);
I also defined the functions in C, and all of this works well. However, since my data could be very long, I changed the storage from plain to external or extended. Then it outputs the wrong result. Is there some TOAST function I need to use in my C functions?
For example, I have an operator that merges two values as follows:
PG_FUNCTION_INFO_V1(mytype_add);
Datum
mytype_add(PG_FUNCTION_ARGS)
{
mytype *anno1 = (mytype *) PG_GETARG_POINTER(0);
mytype *anno2 = (mytype *) PG_GETARG_POINTER(1);
mytype *result;
int newsize;
newsize = VARSIZE(anno1) + VARSIZE(anno2) - VARHDRSZ;
result = (mytype *) palloc(newsize);
SET_VARSIZE(result, newsize);
memcpy(result->data, anno1->data, VARSIZE(anno1) - VARHDRSZ);
memcpy((result->data + VARSIZE(anno1) - VARHDRSZ), anno2->data, VARSIZE(anno2) - VARHDRSZ);
PG_RETURN_POINTER(result);
}
The values in anno1->data (12 bytes, 3 integers) are 10, -1, -1, and the values in anno2->data are 20, -1, -1, so the values in result->data (24 bytes) should be 10, -1, -1, 20, -1, -1.
If I set the storage to plain, I get the correct result above. If I set the storage to external, the output is totally wrong: -256, -1, 1317887 ...
Thanks very much to anyone who can give a hint. I have spent many hours on this.
You are failing to detoast the input Datum. PG_GETARG_POINTER hands back the pointer as stored, so you're concatenating a compressed form, or possibly a pointer to out-of-line storage, rather than the raw data.
I think you need to use PG_GETARG_VARLENA_P(0) to ensure the datum is detoasted before working with it. I have not worked directly with TOAST and varlena types much, though.
It's not clear to me why you're declaring your own type with an identical structure to struct varlena, rather than just using Datum and the underlying struct varlena for variable-length datums. Start with:
struct varlena *anno1 = PG_GETARG_VARLENA_P(0);
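Once both arguments are detoasted, the size arithmetic in mytype_add is sound. The following is a minimal standalone sketch of that arithmetic, not PostgreSQL code: Blob, HDRSZ, blob_make and blob_concat are hypothetical stand-ins for a detoasted varlena, VARHDRSZ, palloc plus SET_VARSIZE, and the operator body, so it can be compiled without the server headers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HDRSZ 4  /* hypothetical stand-in for VARHDRSZ */

/* Stand-in for a detoasted varlena: 4-byte total length, then the payload. */
typedef struct Blob {
    int32_t len;     /* total size in bytes, header included */
    char    data[];  /* payload bytes */
} Blob;

/* Build a blob from raw payload bytes (plays the role of palloc + SET_VARSIZE). */
static Blob *blob_make(const void *payload, size_t nbytes)
{
    Blob *b = malloc(HDRSZ + nbytes);
    assert(b != NULL);
    b->len = (int32_t)(HDRSZ + nbytes);
    memcpy(b->data, payload, nbytes);
    return b;
}

/* Concatenate two payloads; same size arithmetic as mytype_add. */
static Blob *blob_concat(const Blob *a, const Blob *b)
{
    size_t na = (size_t)a->len - HDRSZ;  /* payload bytes of a */
    size_t nb = (size_t)b->len - HDRSZ;  /* payload bytes of b */
    Blob *r = malloc(HDRSZ + na + nb);
    assert(r != NULL);
    r->len = (int32_t)(HDRSZ + na + nb);
    memcpy(r->data, a->data, na);
    memcpy(r->data + na, b->data, nb);
    return r;
}
```

In the real extension the inputs would come from PG_GETARG_VARLENA_P(0) and PG_GETARG_VARLENA_P(1) instead of blob_make; with PG_GETARG_POINTER the length word and payload read here may belong to a compressed or out-of-line representation.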
On a side note, why are you trying to re-implement intarray (badly, i.e. using char arrays)? Please read this relevant article, and this one.
To add one more thing: the main difference is in the stored structure, which carries a header in front of the data indicating its length.
Therefore, when writing the input function, you need to reserve a 4-byte header before your data starts and use SET_VARSIZE(PTR, len) to set the value of that header. Note that len is the total size, including the 4 header bytes.
On the other hand, when retrieving the data, you need to use PG_GETARG_VARLENA_P(n), and the retrieved result also begins with a 4-byte header that indicates the length. You can get the length with VARSIZE_4B(PTR); it returns the total size in bytes, header included, so subtract VARHDRSZ to get the length of the data alone.
To summarize and give some sample code, assume we want to store an unknown number of struct Complex:
typedef struct Complex
{
double x;
double y;
} Complex;
So after receiving the input string, say we determine that we need to store n structs. Therefore, allocate memory:
struct varlena *result = (struct varlena *) palloc(VARHDRSZ + n * sizeof(Complex));
As stated in the documentation, we need to fill in the first 4 bytes with the length. The length passed to SET_VARSIZE is the total size, header included:
SET_VARSIZE(result, VARHDRSZ + n * sizeof(Complex));
Then assign values to the bytes that follow the header. Remember that the address should be aligned as your platform requires; VARDATA gives a portable pointer to the first data byte:
Complex *a = (Complex *) VARDATA(result);
for (int i = 0; i < n; i++) {
a[i].x = input[i];
a[i].y = input[i];
}
Finally, return the value with:
PG_RETURN_POINTER(result);
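The layout built in the steps above can be checked outside the server with a small standalone sketch. It is only a model under stated assumptions: HDRSZ, pack_complex, complex_count and complex_at are hypothetical names, malloc stands in for palloc, and the length word is written with memcpy where the extension would call SET_VARSIZE. The point it illustrates is that the stored length counts the header itself, and that the payload starts at byte offset 4, which is not necessarily suitably aligned for double, so the sketch copies elements out instead of casting.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HDRSZ 4  /* hypothetical stand-in for VARHDRSZ */

typedef struct Complex {
    double x;
    double y;
} Complex;

/* Pack n Complex values behind a 4-byte length word that counts itself. */
static char *pack_complex(const Complex *src, int n)
{
    int32_t total = (int32_t)(HDRSZ + (size_t)n * sizeof(Complex));
    char *buf = malloc((size_t)total);
    assert(buf != NULL);
    memcpy(buf, &total, HDRSZ);                              /* role of SET_VARSIZE */
    memcpy(buf + HDRSZ, src, (size_t)n * sizeof(Complex));   /* payload after header */
    return buf;
}

/* Number of stored elements: (total size - header) / element size. */
static int complex_count(const char *buf)
{
    int32_t total;
    memcpy(&total, buf, HDRSZ);
    return (int)((total - HDRSZ) / (int32_t)sizeof(Complex));
}

/* Copy element i out of the payload; memcpy avoids a misaligned double read. */
static Complex complex_at(const char *buf, int i)
{
    Complex c;
    memcpy(&c, buf + HDRSZ + (size_t)i * sizeof(Complex), sizeof c);
    return c;
}
```

If the length word stored only n * sizeof(Complex), complex_count would come up short by a fraction of an element and the last struct would be lost, which is the same effect as passing a header-less size to SET_VARSIZE.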
To retrieve the data, use:
struct varlena *b = PG_GETARG_VARLENA_P(0);
As stated above, the result also has a 4-byte header at the front giving the length. The output function could be:
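Complex *c = (Complex *) VARDATA(b);
int n = (VARSIZE(b) - VARHDRSZ) / sizeof(Complex);
StringInfoData buf;   /* from lib/stringinfo.h */
/* accumulate every element; assigning psprintf's result in the loop would keep only the last one */
initStringInfo(&buf);
for (int i = 0; i < n; i++) {
appendStringInfo(&buf, "(%g;%g)", c[i].x, c[i].y);
}
PG_RETURN_CSTRING(buf.data);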
I haven't tested this exact code, only a similar version, but the result should be OK. It would be nice if anyone could comment on this or correct any mistakes I made. This is also for my own reference.