
Speed up GROUP BY in Postgres

Hi, I want to create a statistics query in Postgres.

createddate is a timestamp without time zone.

SELECT createddate, count(*) FROM useractivitystatisticsentity GROUP BY createddate

The plan looks like this:

GroupAggregate  (cost=232569.83..256698.22 rows=1378765 width=8)
  ->  Sort  (cost=232569.83..236016.75 rows=1378765 width=8)
        Sort Key: createddate
        ->  Seq Scan on useractivitystatisticsentity  (cost=0.00..54268.65 rows=1378765 width=8)

But the plan didn't change after adding an index:

CREATE INDEX ysdfg
  ON useractivitystatisticsentity
  USING btree
  (createddate );

Any ideas how to speed things up? It takes about 100 seconds for 1,000,000 rows.

I've never seen anyone group by a timestamp. You must have a lot of interactions if you need a count for every microsecond (the granularity of the timestamp data type in Postgres).

In case you really meant to group by date:

SELECT createddate :: date, count(*)
FROM useractivitystatisticsentity
GROUP BY 1

Or, if you don't like casts, this also works:

SELECT date_trunc('day', createddate), count(*)
FROM useractivitystatisticsentity
GROUP BY 1
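The same pattern works at other granularities. As a sketch, assuming hourly buckets are what you want (the aliases bucket and activity_count are just illustrative names, not from the question):

```sql
-- Hypothetical variation: one row per hour instead of one per day.
-- date_trunc zeroes out everything below the given unit.
SELECT date_trunc('hour', createddate) AS bucket,
       count(*)                        AS activity_count
FROM useractivitystatisticsentity
GROUP BY 1
ORDER BY 1;
```

Coarser buckets mean far fewer groups, which is usually what a statistics report needs anyway.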

If the above doesn't help, you could first try updating the table statistics with ANALYZE:

analyze useractivitystatisticsentity
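To see whether refreshing the statistics changed anything, you can compare the plan before and after. A minimal sketch, assuming the date-cast query above is the one you end up running:

```sql
-- Refresh the planner statistics for the table.
ANALYZE useractivitystatisticsentity;

-- EXPLAIN with ANALYZE actually executes the query and reports real
-- row counts and timings next to the estimates; BUFFERS adds I/O details.
EXPLAIN (ANALYZE, BUFFERS)
SELECT createddate::date, count(*)
FROM useractivitystatisticsentity
GROUP BY 1;
```

If the estimated and actual row counts in the output are far apart, the planner is working from a poor picture of your data.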

The query plan depends on the cardinality of the data in your table; check this SQL Fiddle demo. The number of rows is the same in both tables, but the cardinality is different, so the optimizer chooses different plans.
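In your plan the GroupAggregate estimate (rows=1378765) is the same as the Seq Scan estimate, so the planner expects roughly one group per row. As a rough sketch, you can compare what the planner believes with the real cardinality (pg_stats is the standard statistics view; the output aliases are only illustrative):

```sql
-- What the planner believes: a negative n_distinct is a fraction of the
-- row count, so -1 means "every value is unique".
SELECT n_distinct
FROM pg_stats
WHERE tablename = 'useractivitystatisticsentity'
  AND attname   = 'createddate';

-- What is actually in the table (this scans the whole table, so it is slow).
SELECT count(DISTINCT createddate) AS distinct_timestamps,
       count(*)                    AS total_rows
FROM useractivitystatisticsentity;
```

If distinct_timestamps is close to total_rows, grouping by the raw timestamp produces almost no aggregation at all.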

I think it's hard to be more specific without knowing your data.

