
PostgreSQL: Query doesn't work using count

WITH hi AS (
  SELECT ps.id, ps.brgy_locat, ps.municipali, ps.bldg_name, fh.gridcode, ps.bldg_type
  FROM evidensapp_polystructures ps
  JOIN evidensapp_floodhazard fh ON fh.gridcode=3
                                 AND ST_Intersects(fh.geom, ps.geom)
), med AS (
  SELECT ps.id, ps.brgy_locat, ps.municipali ,ps.bldg_name, fh.gridcode, ps.bldg_type
  FROM evidensapp_polystructures ps
  JOIN evidensapp_floodhazard fh ON fh.gridcode=2
                                 AND ST_Intersects(fh.geom, ps.geom)
  EXCEPT SELECT * FROM hi
), low AS (
  SELECT ps.id, ps.brgy_locat, ps.municipali,ps.bldg_name, fh.gridcode, ps.bldg_type
  FROM evidensapp_polystructures ps
  JOIN evidensapp_floodhazard fh ON fh.gridcode=1
                                 AND ST_Intersects(fh.geom, ps.geom)
  EXCEPT SELECT * FROM hi
  EXCEPT SELECT * FROM med
)
SELECT brgy_locat, municipali, bldg_name,  bldg_type, gridcode, count( bldg_name)
FROM (SELECT brgy_locat, municipali, bldg_name, gridcode, bldg_type
      FROM hi
      GROUP BY 1, 2, 3, 4, 5) cnt_hi
FULL JOIN (SELECT brgy_locat, municipali,bldg_name, gridcode, bldg_type
      FROM med
      GROUP BY 1, 2, 3, 4, 5) cnt_med USING (brgy_locat, municipali, bldg_name,gridcode,bldg_type)
FULL JOIN (SELECT brgy_locat, municipali,bldg_name,gridcode, bldg_type
      FROM low
      GROUP BY 1, 2, 3, 4, 5) cnt_low USING (brgy_locat, municipali, bldg_name, gridcode, bldg_type)

The query above returns an error:

ERROR: column "cnt_hi.brgy_locat" must appear in the GROUP BY clause or be used in an aggregate function
SQL state: 42803

If I omit count(bldg_name), the query works, but I need to count based on bldg_name.

EDIT: I want to get the number of buildings that intersect each hazard value (gridcode): High (3), Medium (2) and Low (1). If a geometry already intersects High, exclude it from the Medium query. Likewise, exclude from the Low query any geometry that intersects High or Medium.

PostgreSQL: 9.4, PostGIS: 2.1.7

Table Details:

CREATE TABLE evidensapp_floodhazard (
  id integer NOT NULL DEFAULT nextval('evidensapp_floodhazard_id_seq'::regclass),
  gridcode integer NOT NULL,
  date_field character varying(60),
  geom geometry(MultiPolygon,32651),
  CONSTRAINT evidensapp_floodhazard_pkey PRIMARY KEY (id)
);

CREATE INDEX evidensapp_floodhazard_geom_id
  ON evidensapp_floodhazard USING gist (geom);

ALTER TABLE evidensapp_floodhazard CLUSTER ON evidensapp_floodhazard_geom_id;

CREATE TABLE evidensapp_polystructures (
  id serial NOT NULL,
  bldg_name character varying(100) NOT NULL,
  bldg_type character varying(50) NOT NULL,
  brgy_locat character varying(50) NOT NULL,
  municipali character varying(50) NOT NULL,
  province character varying(50) NOT NULL,
  geom geometry(MultiPolygon,32651),
  CONSTRAINT evidensapp_polystructures_pkey PRIMARY KEY (id)
);

CREATE INDEX evidensapp_polystructures_geom_id
  ON evidensapp_polystructures USING gist (geom);

ALTER TABLE evidensapp_polystructures CLUSTER ON evidensapp_polystructures_geom_id;

The intended output looks like this, but with the correct count: (image not shown)

EDIT 2: To explain the intended output as best I can:

  • Count bldg_name, not id, per gridcode it intersects in floodhazard, under the condition described in the first EDIT above.
  • Then group the result by brgy_locat, municipali, gridcode and bldg_type.

Kindly take a look at the image above.

You probably want this instead:

WITH hi AS (
   SELECT ps.brgy_locat, ps.municipali, ps.bldg_name, ps.bldg_type, fh.gridcode
        , count(*) OVER(PARTITION BY ps.bldg_name, ps.bldg_type) AS building_count
   FROM   evidensapp_polystructures ps
   JOIN   evidensapp_floodhazard    fh ON fh.gridcode = 3
                                      AND ST_Intersects(fh.geom, ps.geom)
   )
, med AS (
   SELECT ps.brgy_locat, ps.municipali, ps.bldg_name, ps.bldg_type, fh.gridcode
        , count(*) OVER(PARTITION BY ps.bldg_name, ps.bldg_type) AS building_count
   FROM   evidensapp_polystructures ps
   JOIN   evidensapp_floodhazard    fh ON fh.gridcode = 2
                                      AND ST_Intersects(fh.geom, ps.geom)
   LEFT   JOIN hi USING (bldg_name, bldg_type)
   WHERE  hi.bldg_name IS NULL
   )
TABLE hi
UNION ALL
TABLE med
UNION ALL
   SELECT ps.brgy_locat, ps.municipali, ps.bldg_name, ps.bldg_type, fh.gridcode
        , count(*) OVER(PARTITION BY ps.bldg_name, ps.bldg_type) AS building_count
   FROM   evidensapp_polystructures ps
   JOIN   evidensapp_floodhazard    fh ON fh.gridcode = 1
                                      AND ST_Intersects(fh.geom, ps.geom)
   LEFT   JOIN hi USING (bldg_name, bldg_type)
   LEFT   JOIN med USING (bldg_name, bldg_type)
   WHERE  hi.bldg_name IS NULL
   AND    med.bldg_name IS NULL;

Based on your comments to the question and the chat, this now counts per (bldg_name, bldg_type), excluding buildings that already intersect on a higher level, again based on (bldg_name, bldg_type).

All other columns are either distinct (id, geom) or functionally dependent noise for the count (brgy_locat, municipali, ...). If not, add more columns to the PARTITION BY clause to disambiguate buildings, and add the same columns to the USING clause of the JOIN condition.

If a building intersects with multiple rows in evidensapp_floodhazard with the same gridcode, it is counted that many times. See the alternative below.

Since you do not actually want to aggregate rows but just count over partitions, the key feature is using count() as a window function, not as an aggregate function like in your original.
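To make the difference concrete, here is a minimal sketch on toy data. The table hi_demo and its rows are made up for illustration and only stand in for the CTE hi; no PostGIS is needed to run it.

```sql
-- Hypothetical toy table standing in for the CTE "hi".
CREATE TEMP TABLE hi_demo (bldg_name text, bldg_type text, brgy_locat text);
INSERT INTO hi_demo VALUES
   ('School', 'Public', 'Brgy A')
 , ('School', 'Public', 'Brgy B')
 , ('Clinic', 'Public', 'Brgy A');

-- Aggregate without GROUP BY next to a plain column: this is the pattern
-- that raises SQL state 42803 in the original query.
-- SELECT brgy_locat, count(bldg_name) FROM hi_demo;

-- Aggregate function: rows are collapsed, one result row per group.
SELECT bldg_name, bldg_type, count(*) AS building_count
FROM   hi_demo
GROUP  BY bldg_name, bldg_type
ORDER  BY bldg_name;
-- Clinic | Public | 1
-- School | Public | 2

-- Window function: every input row is kept, the count is attached to each.
SELECT bldg_name, bldg_type, brgy_locat
     , count(*) OVER (PARTITION BY bldg_name, bldg_type) AS building_count
FROM   hi_demo
ORDER  BY bldg_name, brgy_locat;
-- Clinic | Public | Brgy A | 1
-- School | Public | Brgy A | 2
-- School | Public | Brgy B | 2
```

The window form is why the extra columns (brgy_locat, municipali, ...) can stay in the SELECT list without appearing in any GROUP BY.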

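count(*) does a better job here than count(bldg_name): bldg_name is declared NOT NULL, so both return the same number, and count(*) states the intent (counting rows) directly.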

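Using LEFT JOIN / IS NULL instead of EXCEPT. EXCEPT compares whole rows, and the rows in your CTEs include gridcode, so a row from med (gridcode 2) can never equal a row from hi (gridcode 3) and nothing was actually excluded.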
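A small sketch of that difference, again on made-up data (hi_rows and med_rows are hypothetical stand-ins for the CTEs, reduced to two columns):

```sql
-- Hypothetical stand-ins for the CTEs "hi" and "med".
CREATE TEMP TABLE hi_rows  (bldg_name text, gridcode int);
CREATE TEMP TABLE med_rows (bldg_name text, gridcode int);
INSERT INTO hi_rows  VALUES ('School', 3);
INSERT INTO med_rows VALUES ('School', 2), ('Clinic', 2);

-- EXCEPT compares complete rows: ('School', 2) differs from ('School', 3),
-- so nothing is removed and both School and Clinic come back.
SELECT * FROM med_rows
EXCEPT
SELECT * FROM hi_rows;

-- Anti-join on the identifying column only: School is already in hi_rows,
-- so only Clinic remains.
SELECT m.*
FROM   med_rows m
LEFT   JOIN hi_rows h USING (bldg_name)
WHERE  h.bldg_name IS NULL;
```

Note that EXCEPT also removes duplicate rows from its result, which would interfere with a count per partition even where the exclusion itself worked.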

I also fail to see the purpose of the FULL JOIN in the outer query, so I am using UNION ALL instead.

Alternative query

This counts each building once, no matter how many times it intersects with evidensapp_floodhazard on the same gridcode level.

Also, this variant (unlike the first!) assumes that all rows for the same (bldg_name, bldg_type) match on the same gridcode level, which may or may not be the case:

SELECT brgy_locat, municipali, bldg_name, bldg_type, 3 AS gridcode
     , count(*) OVER(PARTITION BY bldg_name, bldg_type) AS building_count
FROM   evidensapp_polystructures ps
WHERE  EXISTS (
   SELECT 1 FROM evidensapp_floodhazard fh
   WHERE  fh.gridcode = 3 AND ST_Intersects(fh.geom, ps.geom)
   )

UNION ALL
SELECT brgy_locat, municipali, bldg_name, bldg_type, 2 AS gridcode
     , count(*) OVER(PARTITION BY bldg_name, bldg_type) AS building_count
FROM   evidensapp_polystructures ps
WHERE  EXISTS (
   SELECT 1 FROM evidensapp_floodhazard fh
   WHERE  fh.gridcode = 2 AND ST_Intersects(fh.geom, ps.geom)
   )
AND    NOT EXISTS (
   SELECT 1 FROM evidensapp_floodhazard fh
   WHERE  fh.gridcode > 2  -- exclude matches on **all** higher gridcodes
   AND    ST_Intersects(fh.geom, ps.geom)
   )

UNION ALL 
SELECT brgy_locat, municipali, bldg_name, bldg_type, 1 AS gridcode
     , count(*) OVER(PARTITION BY bldg_name, bldg_type) AS building_count
FROM   evidensapp_polystructures ps
WHERE  EXISTS (
   SELECT 1 FROM evidensapp_floodhazard fh
   WHERE  fh.gridcode = 1 AND ST_Intersects(fh.geom, ps.geom)
   )
AND    NOT EXISTS (
   SELECT 1 FROM evidensapp_floodhazard fh
   WHERE  fh.gridcode > 1 AND ST_Intersects(fh.geom, ps.geom)
   );

Also demonstrating a variant without CTEs, which may or may not perform better, depending on data distribution.
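Why the EXISTS variant counts each building once while the JOIN variant counts once per match can be sketched on toy data. The tables below are hypothetical, and the plain equality on bldg_id is only a stand-in for ST_Intersects(fh.geom, ps.geom):

```sql
-- Hypothetical toy tables; h.bldg_id = b.id stands in for ST_Intersects().
CREATE TEMP TABLE bldg_demo   (id int, bldg_name text);
CREATE TEMP TABLE hazard_demo (id int, gridcode int, bldg_id int);
INSERT INTO bldg_demo   VALUES (1, 'School'), (2, 'Clinic');
INSERT INTO hazard_demo VALUES (10, 3, 1), (11, 3, 1), (12, 3, 2);

-- JOIN: one result row per matching hazard row.
-- School matches two hazard rows, so it appears twice with building_count 2.
SELECT b.bldg_name
     , count(*) OVER (PARTITION BY b.bldg_name) AS building_count
FROM   bldg_demo b
JOIN   hazard_demo h ON h.gridcode = 3 AND h.bldg_id = b.id;

-- EXISTS: a semi-join, at most one result row per building.
-- School and Clinic each appear once with building_count 1.
SELECT b.bldg_name
     , count(*) OVER (PARTITION BY b.bldg_name) AS building_count
FROM   bldg_demo b
WHERE  EXISTS (
   SELECT 1 FROM hazard_demo h
   WHERE  h.gridcode = 3 AND h.bldg_id = b.id
   );
```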

Index

Adding gridcode to the index might improve performance (not tested with PostGIS). You need to install the additional module btree_gist first, because gridcode is an integer column and GiST has no default operator class for integers without it:

CREATE INDEX evidensapp_floodhazard_geom_id
  ON evidensapp_floodhazard USING gist (gridcode, geom);
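Note that an index named evidensapp_floodhazard_geom_id already exists in the table definition from the question, so running the statement above as is would fail with a name clash. A hedged sketch of the full sequence follows; the new index name is a made-up placeholder, and CREATE EXTENSION typically needs sufficient privileges on the database:

```sql
-- Provides GiST operator classes for plain types such as integer.
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- Hypothetical index name, chosen only to avoid clashing with the existing
-- evidensapp_floodhazard_geom_id index (which the table is clustered on).
CREATE INDEX evidensapp_floodhazard_gridcode_geom_idx
  ON evidensapp_floodhazard USING gist (gridcode, geom);
```

Whether the planner actually prefers the multicolumn index depends on the data, so it is worth comparing plans with EXPLAIN ANALYZE before and after.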

The error is asking you to include every non-aggregated select-list column in the GROUP BY clause. Alternatively, you can move the count into the subquery, like below:

SELECT brgy_locat, municipali, bldg_name,  bldg_type, 
gridcode, building_count
FROM (SELECT brgy_locat, municipali, bldg_name, gridcode, bldg_type,
      count( bldg_name) as building_count
      FROM hi
      GROUP BY 1, 2, 3, 4, 5) cnt_hi
FULL JOIN (SELECT brgy_locat, municipali,bldg_name, gridcode, bldg_type
      FROM med
      GROUP BY 1, 2, 3, 4, 5) cnt_med 
USING (brgy_locat, municipali, bldg_name,gridcode,bldg_type)
FULL JOIN (SELECT brgy_locat, municipali,bldg_name,gridcode, bldg_type
      FROM low
      GROUP BY 1, 2, 3, 4, 5) cnt_low 
USING (brgy_locat, municipali, bldg_name, gridcode, bldg_type);

I don't know if this will work for you, since I don't have enough knowledge of PostgreSQL, and I am not sure it will give you what you want, but give it a try. You just need to include building_count in your USING clause:

SELECT brgy_locat, municipali, bldg_name,  bldg_type, 
gridcode, building_count
FROM (SELECT brgy_locat, municipali, bldg_name, gridcode, bldg_type,
      count( bldg_name) as building_count
      FROM hi
      GROUP BY 1, 2, 3, 4, 5) cnt_hi
FULL JOIN (SELECT brgy_locat, municipali,bldg_name, gridcode, bldg_type,
      count(bldg_name) as building_count
      FROM med
      GROUP BY 1, 2, 3, 4, 5) cnt_med 
USING (brgy_locat, municipali, bldg_name,gridcode,bldg_type, building_count)
FULL JOIN (SELECT brgy_locat, municipali,bldg_name,gridcode, bldg_type,
      count(bldg_name) as building_count
      FROM low
      GROUP BY 1, 2, 3, 4, 5) cnt_low 
USING (brgy_locat, municipali, bldg_name, gridcode, bldg_type, building_count);

I'm not after the reputation. I just updated Rahul's answer. Hope it helps. Cheers! :)
