Postgresql: variable length user-defined data type storage setting

I defined a variable-length user-defined data type in PostgreSQL, following the docs ( http://www.postgresql.org/docs/9.0/static/xtypes.html ).

C definition:

typedef struct MyType {
    char    vl_len_[4];
    char    data[1];
} mytype;

CREATE TYPE statements:

CREATE TYPE mytype;
CREATE FUNCTION mytype_in(cstring) RETURNS mytype AS 'mytype' LANGUAGE C IMMUTABLE STRICT;
CREATE FUNCTION mytype_out(mytype) RETURNS cstring AS 'mytype' LANGUAGE C IMMUTABLE STRICT;
CREATE FUNCTION mytype_recv(internal) RETURNS mytype AS 'mytype' LANGUAGE C IMMUTABLE STRICT;
CREATE FUNCTION mytype_send(mytype) RETURNS bytea AS 'mytype' LANGUAGE C IMMUTABLE STRICT;

CREATE TYPE mytype (
 internallength = VARIABLE,
 input = mytype_in,
 output = mytype_out,
 receive = mytype_recv,
 send = mytype_send,
 alignment = int4,
 storage = plain
);

I also defined the functions in C, and all of this works well. However, since my data can be very long, I changed the storage from plain to external or extended, and now the output is wrong. Is there some TOAST function I need to call in my C functions?

For example:

I have an operator to merge two values as follows:

PG_FUNCTION_INFO_V1(mytype_add);

Datum
mytype_add(PG_FUNCTION_ARGS)
{
    mytype *anno1 = (mytype *) PG_GETARG_POINTER(0);
    mytype *anno2 = (mytype *) PG_GETARG_POINTER(1);
    mytype    *result;
    int     newsize;

    newsize = VARSIZE(anno1) + VARSIZE(anno2) - VARHDRSZ;
    result = (mytype *) palloc(newsize);
    SET_VARSIZE(result, newsize);
    memcpy(result->data, anno1->data, VARSIZE(anno1) - VARHDRSZ);
    memcpy((result->data + VARSIZE(anno1) - VARHDRSZ), anno2->data, VARSIZE(anno2) - VARHDRSZ);

    PG_RETURN_POINTER(result);
}

The values in anno1->data (12 bytes, 3 integers) are 10, -1, -1, and the values in anno2->data are 20, -1, -1.

So the values in result->data (24 bytes) should be 10, -1, -1, 20, -1, -1.

With storage set to plain, I get the correct result above. With storage set to external, the output is totally wrong: -256, -1, 1317887 ...

Any hint would be much appreciated. I have spent many hours on this.

You are failing to detoast the input Datum. So you're concatenating a compressed form, or possibly a pointer to out-of-line storage (or, for a value this small, most likely a packed short-header form), rather than the raw data.

I think you need to use PG_GETARG_VARLENA_P(0) to ensure the datum is detoasted before working with it. I have not worked directly with TOAST and varlena types much, though.
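
One plausible way to see where the -256 in the question comes from: for types whose storage is not plain, PostgreSQL can store a small value with a packed 1-byte header instead of the 4-byte one. The sketch below is a standalone model, not PostgreSQL code. All function names are hypothetical, and it assumes the little-endian header encoding, where the header byte is (total length << 1) | 1 and the length counts the header byte itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit integer as 4 little-endian bytes. */
void write_le32(unsigned char *p, int32_t v)
{
    uint32_t u;
    memcpy(&u, &v, sizeof u);
    p[0] = (unsigned char) (u & 0xFF);
    p[1] = (unsigned char) ((u >> 8) & 0xFF);
    p[2] = (unsigned char) ((u >> 16) & 0xFF);
    p[3] = (unsigned char) ((u >> 24) & 0xFF);
}

/* Read 4 little-endian bytes back as a 32-bit integer. */
int32_t read_le32(const unsigned char *p)
{
    uint32_t u = (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
                 ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
    int32_t v;
    memcpy(&v, &u, sizeof v);
    return v;
}

/* Hypothetical model: write three int32 values behind a packed 1-byte header.
 * buf must hold at least 13 bytes. Returns the total size. */
size_t pack_short(unsigned char *buf, const int32_t vals[3])
{
    size_t total = 1 + 3 * 4;   /* header byte + 12 payload bytes */
    buf[0] = (unsigned char) ((total << 1) | 0x01);
    for (int i = 0; i < 3; i++)
        write_le32(buf + 1 + 4 * i, vals[i]);
    return total;
}

/* What the question's code effectively does: assume the payload starts at offset 4. */
int32_t read_assuming_4byte_header(const unsigned char *buf, int i)
{
    return read_le32(buf + 4 + 4 * i);
}

/* A header-aware read: the payload starts right after the 1-byte header. */
int32_t read_after_short_header(const unsigned char *buf, int i)
{
    return read_le32(buf + 1 + 4 * i);
}
```

With the values 10, -1, -1 packed this way, read_assuming_4byte_header returns -256 for index 0 and -1 for index 1, the same leading values as the wrong output in the question. PG_GETARG_VARLENA_P sidesteps this because it hands back a copy with a regular 4-byte header.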

It's not clear to me why you're declaring your own type with an identical structure to struct varlena, rather than just using Datum and the underlying struct varlena for variable-length datums. Start with:

struct varlena *anno1 = PG_GETARG_VARLENA_P(0);
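
Once both arguments are in plain 4-byte-header form, the concatenation itself is only size bookkeeping. The following is a standalone sketch of that bookkeeping, written outside PostgreSQL so it can be compiled on its own. ToyVarlena, TOY_HDRSZ and the toy_* functions are hypothetical stand-ins for struct varlena, VARHDRSZ, VARSIZE/VARDATA and SET_VARSIZE, and malloc stands in for palloc.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_HDRSZ 4   /* stand-in for VARHDRSZ */

/* Hypothetical stand-in for a detoasted varlena: total length, then payload. */
typedef struct ToyVarlena {
    uint32_t total_len;   /* header + payload, in bytes */
    char     data[];      /* payload bytes */
} ToyVarlena;

/* Payload size = total size minus the header. */
uint32_t toy_payload_len(const ToyVarlena *v)
{
    return v->total_len - TOY_HDRSZ;
}

/* Wrap raw payload bytes in a header. Returns NULL on allocation failure. */
ToyVarlena *toy_make(const void *payload, uint32_t payload_len)
{
    ToyVarlena *v = malloc(TOY_HDRSZ + payload_len);
    if (v == NULL)
        return NULL;
    v->total_len = TOY_HDRSZ + payload_len;
    memcpy(v->data, payload, payload_len);
    return v;
}

/* Concatenate two payloads under a single new header. */
ToyVarlena *toy_concat(const ToyVarlena *a, const ToyVarlena *b)
{
    uint32_t la = toy_payload_len(a);
    uint32_t lb = toy_payload_len(b);
    ToyVarlena *r = malloc(TOY_HDRSZ + la + lb);
    if (r == NULL)
        return NULL;
    r->total_len = TOY_HDRSZ + la + lb;   /* one header, two payloads */
    memcpy(r->data, a->data, la);
    memcpy(r->data + la, b->data, lb);
    return r;
}
```

The arithmetic in the question's mytype_add matches this model; what the question's code lacks is the guarantee that its inputs are already in this plain form.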

On a side note, why are you trying to re-implement intarray (badly, i.e. using char arrays)? Please read this relevant article, and this one.

One more thing to add: the main difference is the storage layout. A variable-length value is stored with a header in front of the data that records its length.

Therefore, when writing the input function, you need to reserve a 4-byte header before your data and use SET_VARSIZE(PTR, len) to set it, where len is the total size including the header.

Conversely, when reading the data, use PG_GETARG_VARLENA_P(n). The value it returns also starts with a 4-byte length header. VARSIZE(PTR) returns the total size in bytes including the header, and VARSIZE_ANY_EXHDR(PTR) returns the size of the data alone.

To summarize with sample code, assume we want to store an unknown number of Complex structs:

typedef struct Complex 
{
    double      x;
    double      y;
} Complex;

Suppose that after parsing the input string we find we need to store n structs. Allocate memory for the payload plus the header:

struct varlena *result = (struct varlena *) palloc(n * sizeof(Complex) + VARHDRSZ);

As stated in the documentation, we need to set the length in the first 4 bytes. The length includes the header itself:

SET_VARSIZE(result, n * sizeof(Complex) + VARHDRSZ);

Then fill in the bytes that follow the header. Mind the alignment: the data starts 4 bytes after the palloc'd pointer, so on platforms that require 8-byte alignment for double you may need padding or memcpy:

Complex *a = (Complex *) VARDATA(result);
for (int i = 0; i < n; i++) {
    a[i].x = input[i];
    a[i].y = input[i];
}

Finally, return the value with:

PG_RETURN_POINTER(result);

To retrieve the data, use:

struct varlena *b = PG_GETARG_VARLENA_P(0);

As stated above, the result also has a 4-byte length header at the front, so the output function could be:

Complex *c = (Complex *) VARDATA(b);
int n = VARSIZE_ANY_EXHDR(b) / sizeof(Complex);
StringInfoData buf;     /* from lib/stringinfo.h */

initStringInfo(&buf);
for (int i = 0; i < n; i++) {
    /* append each element; assigning psprintf() in the loop would keep only the last one */
    appendStringInfo(&buf, "(%g;%g)", c[i].x, c[i].y);
}
PG_RETURN_CSTRING(buf.data);
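
The output step can also be modeled outside PostgreSQL. The sketch below is a standalone version under stated assumptions: format_complex_list is a hypothetical helper, malloc and snprintf stand in for the server's own memory and string routines, and the payload is the raw bytes of the Complex values with the header already skipped. Each element is copied out with memcpy, so the payload does not have to be 8-byte aligned, and every element is appended to the output.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct Complex {
    double x;
    double y;
} Complex;

/* Format every Complex in the payload as "(x;y)", concatenated.
 * Returns a malloc'd string the caller must free, or NULL on failure. */
char *format_complex_list(const char *payload, size_t payload_len)
{
    size_t n = payload_len / sizeof(Complex);
    size_t cap = 64 * n + 1;   /* generous bound: each "(%g;%g)" is well under 64 bytes */
    size_t used = 0;
    char *out = malloc(cap);

    if (out == NULL)
        return NULL;
    out[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        Complex c;
        int w;

        /* memcpy avoids reading a double through a misaligned pointer */
        memcpy(&c, payload + i * sizeof(Complex), sizeof(Complex));
        w = snprintf(out + used, cap - used, "(%g;%g)", c.x, c.y);
        if (w < 0 || (size_t) w >= cap - used) {
            free(out);
            return NULL;
        }
        used += (size_t) w;
    }
    return out;
}
```

For two values (1, 2) and (3.5, -4), this returns the string "(1;2)(3.5;-4)".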

I haven't tested this exact code, only a similar version, but the result should be OK. Comments or corrections are welcome. This is also for my own reference.
