简体   繁体   中英

Postgresql user-defined aggregate function count

I am defining time-varying integers, ie, arrays of time-varying integer segments, the latter are integer values associated with a timestamp range.

CREATE TYPE integerTS AS (val integer, p tsrange);
CREATE TYPE integerTT AS (traj integerTS[]);

An example of such a value is

select integerTT(ARRAY[
integerTS(1, '2012-01-01 08:00:00', '2012-01-01 08:10:00'),
integerTS(2, '2012-01-01 08:10:00', '2012-01-01 08:20:00')
])

I am able to define min, max, and sum aggregate functions but over these types, for example

WITH Values AS (
SELECT integerTT(ARRAY[
    integerTS(3, '2012-01-01 08:00:00', '2012-01-01 08:20:00')]) as val
UNION
SELECT integerTT(ARRAY[
    integerTS(2, '2012-01-01 08:10:00', '2012-01-01 08:30:00')])
UNION
SELECT integerTT(ARRAY[
    integerTS(1, '2012-01-01 08:20:00', '2012-01-01 08:40:00')])
)
SELECT min(val)
from Values

result in

integerTT(ARRAY[
    integerTS(3, '2012-01-01 08:00:00', '2012-01-01 08:10:00'),
    integerTS(2, '2012-01-01 08:10:00', '2012-01-01 08:20:00'),
    integerTS(1, '2012-01-01 08:20:00', '2012-01-01 08:40:00')])

However, I cannot define a count aggregate function for time-varying integers. I started to define the functions as follows.

CREATE OR REPLACE FUNCTION count(tt1 integerTT, tt2 integerTT) RETURNS integerTT AS
$BODY$
DECLARE 
BEGIN
    -- 0 is a dummy value
    return integerTT(count(integerTS(0,getT(tt1)), integerTS(0,getT(tt2))));
END;
$BODY$ LANGUAGE plpgsql IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION count(ts1 integerTS, ts2 integerTS) RETURNS integerTS[] AS
$BODY$
DECLARE 
    intersection tsrange;
    result integerTS[];
BEGIN
    intersection := getT(ts1) * getT(ts2);
    IF isempty(intersection) THEN
        IF getT(ts1) << getT(ts2) THEN
            result := ARRAY[integerTS(1,getT(ts1)),integerTS(1,getT(ts2))];
        ELSE 
            result := ARRAY[integerTS(1,getT(ts2)),integerTS(1,getT(ts1))];
        END IF;
    ELSE 
        IF lower(getT(ts1)) < lower(intersection) THEN
        result := array_append(result,integerTS(1,tsrange(lower(getT(ts1)), lower(intersection))));
        END IF;
        IF lower(getT(ts2)) < lower(intersection) THEN
        result := array_append(result,integerTS(1,tsrange(lower(getT(ts2)), lower(intersection))));
        END IF;
        result := array_append(result,integerTS(2,intersection));   
        IF upper(intersection) < upper(getT(ts1)) THEN
        result := array_append(result,integerTS(1,tsrange(upper(intersection), upper(getT(ts1)))));
        END IF;
        IF upper(intersection) < upper(getT(ts2)) THEN
        result := array_append(result,integerTS(1,tsrange(upper(intersection), upper(getT(ts2)))));
        END IF;
    END IF;
    RETURN result;
END;
$BODY$ LANGUAGE plpgsql IMMUTABLE STRICT;

Then

WITH Values AS (
SELECT integerTT(ARRAY[
    integerTS(3, '2012-01-01 08:00:00', '2012-01-01 08:20:00')]) as val
UNION
SELECT integerTT(ARRAY[
    integerTS(2, '2012-01-01 08:10:00', '2012-01-01 08:30:00')])
UNION
SELECT integerTT(ARRAY[
    integerTS(1, '2012-01-01 08:20:00', '2012-01-01 08:40:00')])
)
SELECT count(val)
from Values

result in a correct value as follows

integerTT(ARRAY[
    integerTS(1, '2012-01-01 08:00:00', '2012-01-01 08:10:00'),
    integerTS(2, '2012-01-01 08:10:00', '2012-01-01 08:30:00'),
    integerTS(1, '2012-01-01 08:30:00', '2012-01-01 08:40:00')])

However, when the aggregate function defined as follows

CREATE AGGREGATE count (integerTT)
(
    sfunc = count,
    stype = integerTT
);

does not work since it always return 1 and 2 count values. For example

WITH Values AS (
SELECT integerTT(ARRAY[
    integerTS(3, '2012-01-01 08:00:00', '2012-01-01 08:20:00')]) as val
UNION
SELECT integerTT(ARRAY[
    integerTS(4, '2012-01-01 08:00:00', '2012-01-01 08:20:00')]) as val
UNION
SELECT integerTT(ARRAY[
    integerTS(5, '2012-01-01 08:00:00', '2012-01-01 08:20:00')]) as val
UNION
SELECT integerTT(ARRAY[
    integerTS(6, '2012-01-01 08:00:00', '2012-01-01 08:20:00')]) as val
UNION
SELECT integerTT(ARRAY[
    integerTS(7, '2012-01-01 08:00:00', '2012-01-01 08:20:00')]) as val
)
SELECT count(val)
from Values

returns

integerTT(ARRAY[
    integerTS(2, '2012-01-01 08:00:00', '2012-01-01 08:20:00')])

while it should return

integerTT(ARRAY[
    integerTS(5, '2012-01-01 08:00:00', '2012-01-01 08:20:00')])

How to define the time-varying count ? I am using PostgreSQL version 9.4.1.

While this is not answer to what you actually ask, it's the answer I would actually give. It's too complex for a comment.

Just because you can create a type that is an array of a composite type consisting of an integer and a range type, and even create aggregate functions on top of that ... does not mean it's a good idea to do it.

It is much simper, faster, less error prone and requires less disk space to implement integerTT as actual table :

CREATE tbl (
  tbl_id serial PRIMARY KEY
, batch_id int  -- to wrap a set of "integerTS"
, val integer   -- your "integerTS" decomposed
, p tsrange     -- your "integerTS" decomposed
);

And then write plain SQL queries with existing (much faster) aggregate functions.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM