Speed up Group by in postgres

Question

Hi, I want to create a statistics query in Postgres.

createddate is a timestamp without time zone.

SELECT createddate, count(*) FROM useractivitystatisticsentity GROUP BY createddate

The plan looks like this:

GroupAggregate  (cost=232569.83..256698.22 rows=1378765 width=8)
  ->  Sort  (cost=232569.83..236016.75 rows=1378765 width=8)
        Sort Key: createddate
        ->  Seq Scan on useractivitystatisticsentity  (cost=0.00..54268.65 rows=1378765 width=8)

But the plan didn't change after adding an index:

CREATE INDEX ysdfg
  ON useractivitystatisticsentity
  USING btree
  (createddate );

Any ideas how to speed things up? It takes about 100 seconds for 1,000,000 rows.

Answer 1

I've never seen anyone group by a timestamp - you must have a lot of interactions if you need to do a count for every microsecond of time (the granularity of the timestamp data type in Postgres).

In case you really meant to group by date:

SELECT createddate :: date, count(*)
FROM useractivitystatisticsentity
GROUP BY 1

or if you don't like casts, this also works:

SELECT date_trunc('day', createddate), count(*)
FROM useractivitystatisticsentity
GROUP BY 1

If the above doesn't help, you could first try updating the table statistics with ANALYZE:

analyze useractivitystatisticsentity
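
To see whether any of this changes the plan, one option is to refresh the statistics and then look at the plan that is actually executed. This is only a minimal sketch, assuming the table and column names from the question; note that EXPLAIN ANALYZE really runs the query, so it takes as long as the query itself.

```sql
-- Refresh the planner statistics for the table from the question.
ANALYZE useractivitystatisticsentity;

-- Show the executed plan with real row counts and timings
-- for the per-day grouping.
EXPLAIN ANALYZE
SELECT createddate::date AS day, count(*)
FROM useractivitystatisticsentity
GROUP BY 1;
```

If the estimated number of groups drops to roughly the number of distinct days, the planner may switch from Sort plus GroupAggregate to a HashAggregate. Whether it does depends on the data and on settings such as work_mem.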

Answer 2

The query plan depends on the cardinality of the data in your table - check this SQL Fiddle demo. The number of rows is equal in both tables, but the cardinality is different, so the optimizer chooses different plans.

I think it's hard to be more specific without knowing your data.
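
As a rough way to check the cardinality yourself, something like the following could be used. It is a sketch that assumes the table and column from the question; the count(DISTINCT ...) query scans the whole table, so it will be slow on a large table.

```sql
-- How many distinct grouping keys are there really?
SELECT count(*)                          AS total_rows,
       count(DISTINCT createddate)       AS distinct_timestamps,
       count(DISTINCT createddate::date) AS distinct_days
FROM useractivitystatisticsentity;

-- What the planner believes (filled in by ANALYZE).
-- n_distinct > 0 is an estimated number of distinct values;
-- n_distinct < 0 is a negated fraction of the row count
-- (-1 means every value is assumed to be unique).
SELECT n_distinct
FROM pg_stats
WHERE tablename = 'useractivitystatisticsentity'
  AND attname = 'createddate';
```

In the plan from the question, the GroupAggregate estimate (rows=1378765) equals the Seq Scan estimate, which suggests the planner expects nearly every timestamp to be distinct.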

You may find these links useful:

solution1
3 ACCEPTED 2013-09-03 13:54:01

solution2
1 2013-09-03 11:28:14
