Find median without using window functions

I have a dataset with the columns customer, product, and quantity. I want to find the median of quantity for each product. Assume that every product has an odd number of rows.

Constructs such as WITH and JOIN, and aggregate functions such as count, avg, max, and min, are allowed. A solution using nested subqueries would be ideal.

So far I have listed the product and quantity in sorted order and computed the position of the median row with ROUND((COUNT(QUANT) / 2) + 1). Now I need to fetch the row at that position without using any window function.

Input

Product Quantity
A 1
A 2
A 3
B 5
B 6
B 7
C 11
C 13
C 15
D 4
D 5
D 6

Output

Product Median
A 2
B 6
C 13
D 5

You can try these two solutions (both use PostgreSQL syntax), where prod is the name of your table:

Solution 1: with a window function

SELECT DISTINCT ON (t.Product)
       t.Product
     , nth_value(t.Quantity, m.median) OVER (PARTITION BY t.Product ORDER BY Quantity ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS median_Qty
  FROM prod AS t
 INNER JOIN
     ( SELECT Product, round(count(*)/2+1) :: integer AS median
         FROM prod
        GROUP BY Product
     ) AS m
    ON m.Product = t.Product

Solution 2: without a window function

This solution is based on a SELECT ... LIMIT 1 OFFSET median query, where median is a value known only at run time. Here the query is built as a dynamic statement inside a plpgsql function whose input parameter is the Product:

CREATE OR REPLACE FUNCTION median_quantity(INOUT myProduct varchar(1), OUT Median_Quantity integer)
RETURNS setof record LANGUAGE plpgsql AS
$$
DECLARE
  median integer ;
BEGIN
    -- Zero-based offset of the middle row (integer division; assumes an odd row count)
    SELECT round(count(*)/2) :: integer
      INTO median
      FROM prod
     WHERE Product = myProduct ;
     
    RETURN QUERY EXECUTE E'
    SELECT Product, Quantity
      FROM prod
     WHERE Product = ' || quote_nullable(myProduct) || '
     ORDER BY Quantity
     LIMIT 1
     OFFSET ' || median ;
END ;
$$ ;
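Note that Solution 1 uses count/2 + 1 while the function uses count/2: nth_value counts rows from 1, whereas OFFSET counts the rows to skip, so both point at the same middle row. The two steps of the function can be sketched outside the database as follows. This is only an illustration, assuming an in-memory SQLite copy of part of the sample data; the helper name median_by_offset is hypothetical and not part of the answer.

```python
import sqlite3

# Hypothetical in-memory copy of part of the sample data
# (table name `prod` as in the answer).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prod (Product TEXT, Quantity INTEGER)")
conn.executemany(
    "INSERT INTO prod VALUES (?, ?)",
    [("A", 1), ("A", 2), ("A", 3), ("B", 5), ("B", 6), ("B", 7)],
)


def median_by_offset(product):
    """Return the middle Quantity for one product (odd row counts only)."""
    # Step 1: number of rows to skip, i.e. the zero-based index of the middle row.
    (n,) = conn.execute(
        "SELECT count(*) FROM prod WHERE Product = ?", (product,)
    ).fetchone()
    offset = n // 2
    # Step 2: skip `offset` rows in sorted order and take the next one.
    row = conn.execute(
        "SELECT Quantity FROM prod WHERE Product = ? "
        "ORDER BY Quantity LIMIT 1 OFFSET ?",
        (product, offset),
    ).fetchone()
    return row[0] if row else None


print(median_by_offset("A"), median_by_offset("B"))  # prints: 2 6
```

With three rows per product the offset is 1, so the second row in sorted order is returned, which is the same row that nth_value(..., 2) selects in Solution 1.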

The expected result is given by the query:

SELECT m.myProduct, m.Median_Quantity
  FROM 
     ( SELECT DISTINCT ON (Product)
              Product
         FROM prod
     ) AS p
 CROSS JOIN LATERAL median_quantity(p.Product) AS m

All the details are in the dbfiddle.
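Since the question asks for nested subqueries, here is a possible plain-SQL variant that is not part of the dbfiddle. It is a sketch under the question's assumption of an odd number of rows per product: a quantity is the median when at most (n - 1) / 2 rows of the same product are smaller and at most (n - 1) / 2 are larger. The query uses only count and correlated subqueries, so it should also run on PostgreSQL, but it is checked here only against an in-memory SQLite database; the names MEDIAN_SQL and medians are hypothetical.

```python
import sqlite3

# Median per product without window functions, LIMIT, or OFFSET.
# Assumes every product has an odd number of rows.
MEDIAN_SQL = """
SELECT DISTINCT p.Product, p.Quantity AS Median
  FROM prod AS p
 WHERE ( SELECT count(*) FROM prod AS lo
          WHERE lo.Product = p.Product AND lo.Quantity < p.Quantity )
       <= ( SELECT (count(*) - 1) / 2 FROM prod AS n
             WHERE n.Product = p.Product )
   AND ( SELECT count(*) FROM prod AS hi
          WHERE hi.Product = p.Product AND hi.Quantity > p.Quantity )
       <= ( SELECT (count(*) - 1) / 2 FROM prod AS n
             WHERE n.Product = p.Product )
 ORDER BY p.Product
"""


def medians(rows):
    """Load (Product, Quantity) rows into a scratch table and run MEDIAN_SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prod (Product TEXT, Quantity INTEGER)")
    conn.executemany("INSERT INTO prod VALUES (?, ?)", rows)
    result = conn.execute(MEDIAN_SQL).fetchall()
    conn.close()
    return result


sample = [
    ("A", 1), ("A", 2), ("A", 3),
    ("B", 5), ("B", 6), ("B", 7),
    ("C", 11), ("C", 13), ("C", 15),
    ("D", 4), ("D", 5), ("D", 6),
]
print(medians(sample))  # [('A', 2), ('B', 6), ('C', 13), ('D', 5)]
```

DISTINCT is there so that a median value that occurs several times within a product is still reported once. The correlated subqueries rescan the table for every row, so this is likely to be slower than the window-function version on large tables.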
