PostgreSQL timestamp index not used when cast to date

This is my query:

SELECT
    i::date AS day,
    (SELECT COUNT(*) FROM genericevent WHERE event = 'chat_message' AND eventDate::date = i::date AND extra1 = 'public') AS message_public_total,
    (SELECT COUNT(*) FROM genericevent WHERE event = 'chat_message' AND eventDate::date = i::date AND extra1 = 'public' AND extra2 = 'clean') AS message_public_clean
FROM generate_series('2013-08-01', '2013-08-27', INTERVAL '1 day') i

I have an index which I as a human consider fully usable for this query (in fact it should result in an index-only scan):

CREATE INDEX idx__genericevent__event__extra1__date
  ON genericevent
  USING btree
  (event COLLATE pg_catalog."default", extra1 COLLATE pg_catalog."default", eventDate);

However, as EXPLAIN shows, PostgreSQL doesn't deem it so. It uses event and extra1 from this index, but not eventDate (see the Index Cond lines):

"Function Scan on generate_series i  (cost=0.00..145219698.17 rows=1000 width=8)"
"  SubPlan 1"
"    ->  Aggregate  (cost=72274.87..72274.88 rows=1 width=0)"
"          ->  Bitmap Heap Scan on genericevent  (cost=11367.74..72271.51 rows=1345 width=0)"
"                Recheck Cond: (((event)::text = 'chat_message'::text) AND ((extra1)::text = 'public'::text))"
"                Filter: ((eventDate)::date = (i.i)::date)"
"                ->  Bitmap Index Scan on idx__genericevent__event__extra1__date  (cost=0.00..11367.40 rows=269012 width=0)"
"                      Index Cond: (((event)::text = 'chat_message'::text) AND ((extra1)::text = 'public'::text))"
"  SubPlan 2"
"    ->  Aggregate  (cost=72944.79..72944.80 rows=1 width=0)"
"          ->  Bitmap Heap Scan on genericevent  (cost=11367.50..72943.80 rows=396 width=0)"
"                Recheck Cond: (((event)::text = 'chat_message'::text) AND ((extra1)::text = 'public'::text))"
"                Filter: (((extra2)::text = 'clean'::text) AND ((eventDate)::date = (i.i)::date))"
"                ->  Bitmap Index Scan on idx__genericevent__event__extra1__date  (cost=0.00..11367.40 rows=269012 width=0)"
"                      Index Cond: (((event)::text = 'chat_message'::text) AND ((extra1)::text = 'public'::text))"

I think it may have something to do with the eventDate::date cast. How can I change the query or the index to improve performance?

For completeness, here's the table:

CREATE TABLE genericevent
(
  id bigint NOT NULL,
  eventDate timestamp with time zone NOT NULL,
  event character varying(50) NOT NULL,
  extra1 character varying(100),
  extra2 character varying(100),
  CONSTRAINT genericevent_pkey PRIMARY KEY (id)
)

For the index to be used, you need to compare eventDate as a timestamp rather than casting it to a date.

On paper, you could change the index to an expression index on the timestamp truncated to a date. But this won't work if the timestamp has a time zone: the cast to date is then not immutable, because its result depends on the session's TimeZone setting, and PostgreSQL only accepts immutable expressions in an index.
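To illustrate the immutability point, here is a minimal sketch. It assumes that it is acceptable to define a "day" in one fixed zone; the zone 'UTC' and the index name are assumptions made for illustration, not part of the original question. Pinning the zone with AT TIME ZONE makes the expression immutable, so it can be indexed, but the query then has to use exactly the same expression.

```sql
-- Rejected for a timestamptz column, because the cast depends on the
-- session's TimeZone setting:
--   CREATE INDEX ... ON genericevent (event, extra1, (eventDate::date));
--   ERROR:  functions in index expression must be marked IMMUTABLE

-- Accepted: the zone is fixed, so the expression is immutable.
-- 'UTC' and the index name are assumptions for illustration.
CREATE INDEX idx__genericevent__event__extra1__utc_day
    ON genericevent (event, extra1, ((eventDate AT TIME ZONE 'UTC')::date));

-- The query has to repeat the indexed expression to match the index.
SELECT count(*)
FROM genericevent
WHERE event = 'chat_message'
  AND extra1 = 'public'
  AND (eventDate AT TIME ZONE 'UTC')::date = date '2013-08-01';
```

Note that a UTC day is not the same set of rows as a day in the session's time zone, so this changes the meaning of "day" unless the session also runs in UTC.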

In practice, you'd need to change the equality clause to an equivalent pair of inequalities, e.g. something like:

eventDate >= i and eventDate < i + interval '1 day'
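Applied to the query from the question, the range condition might look like the sketch below. It keeps the two subqueries unchanged apart from the date predicate; the only additions are the explicit timestamptz literals, which are there to make the type of i unambiguous. Because i is midnight in the session's time zone, the half-open range [i, i + 1 day) selects the same rows as eventDate::date = i::date.

```sql
SELECT
    i::date AS day,
    (SELECT count(*)
       FROM genericevent
      WHERE event = 'chat_message'
        AND extra1 = 'public'
        AND eventDate >= i                      -- start of the day, inclusive
        AND eventDate <  i + interval '1 day'   -- start of the next day, exclusive
    ) AS message_public_total,
    (SELECT count(*)
       FROM genericevent
      WHERE event = 'chat_message'
        AND extra1 = 'public'
        AND extra2 = 'clean'
        AND eventDate >= i
        AND eventDate <  i + interval '1 day'
    ) AS message_public_clean
FROM generate_series(timestamptz '2013-08-01',
                     timestamptz '2013-08-27',
                     interval '1 day') AS i;
```

With the bare column on the left-hand side, the planner is free to use all three columns of idx__genericevent__event__extra1__date; running EXPLAIN again should show whether eventDate now appears in the Index Cond lines.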

But before proceeding with rewriting the query, note that you could simply add the appropriate where clauses to Clodoaldo Neto's query:

select
    i::date as day,
    count(*) as message_public_total,
    count(extra2 = 'clean' or null) as message_public_clean
from
    genericevent
    right join
    generate_series(
        '2013-08-01', '2013-08-27', interval '1 day'
    ) i on eventdate::date = i::date
where
    event = 'chat_message'
    and extra1 = 'public'
    and eventDate >= '2013-08-01'
    and eventDate < timestamptz '2013-08-27' + interval '1 day'
group by 1

Or:

select
    i::date as day,
    count(*) as message_public_total,
    count(extra2 = 'clean' or null) as message_public_clean
from
    genericevent
    right join
    generate_series(
        '2013-08-01', '2013-08-27', interval '1 day'
    ) i on eventdate >= i and eventDate < i + interval '1 day'
where
    event = 'chat_message'
    and extra1 = 'public'
--    and eventDate >= '2013-08-01'
--    and eventDate < timestamptz '2013-08-27' + interval '1 day'
group by 1
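One caveat with the join form: a where clause on genericevent columns discards the null-extended rows of the outer join, so days without any matching message drop out of the result, whereas the subquery version in the question reports them with a count of 0. A possible variant that keeps those days, sketched under the assumption that this is the desired behaviour, moves the filters into the join condition and counts a column that is never null. The aliases d, day_start and g are names introduced here for illustration.

```sql
SELECT
    d.day_start::date AS day,
    -- count(g.id) ignores the null-extended row of a day without events
    count(g.id) AS message_public_total,
    -- same counting idiom as above: the expression is NULL unless extra2 = 'clean'
    count(g.extra2 = 'clean' OR NULL) AS message_public_clean
FROM generate_series(timestamptz '2013-08-01',
                     timestamptz '2013-08-27',
                     interval '1 day') AS d(day_start)
LEFT JOIN genericevent AS g
       ON g.event = 'chat_message'
      AND g.extra1 = 'public'
      AND g.eventDate >= d.day_start
      AND g.eventDate <  d.day_start + interval '1 day'
GROUP BY d.day_start
ORDER BY d.day_start;
```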

This equivalent query does only one scan instead of two as in yours.

select
    i::date as day,
    count(*) as message_public_total,
    count(extra2 = 'clean' or null) as message_public_clean
from
    genericevent
    right join
    generate_series(
        '2013-08-01', '2013-08-27', interval '1 day'
    ) i on eventdate::date = i::date
where
    event = 'chat_message'
    and extra1 = 'public'
group by 1

Then the index would be:

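-- An expression column needs its own parentheses.
-- This is accepted only if eventDate is timestamp without time zone;
-- the cast from timestamp with time zone is not immutable.
create index idx on genericevent (
    (eventDate::date),
    event,
    extra1
)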

I placed the date first as I guess it has the highest cardinality.
