Optimizing SQLite query: grouping in subqueries
I have a very simple SQLite schema for recording daily counts per user action, along with latency percentiles for each user action per day:
create table user_actions (
id integer primary key,
name text not null
);
create table action_date_count (
action_id integer not null
references user_actions(id) on delete restrict on update restrict,
date integer not null,
count integer not null,
unique (action_id, date) on conflict fail
);
create table latency_percentiles (
action_id integer not null
references user_actions(id) on delete restrict on update restrict,
date integer not null,
percentile integer not null,
value real not null,
unique (action_id, date, percentile) on conflict fail
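);
All dates here are stored as the Unix timestamp of midnight of each day (I can change that if it would help).
Now, this is the query I'm struggling with: show the actions over the last week, ordered by average volume descending, including the average latency at the 50th, 90th, and 95th percentiles. I've come up with a huge query whose explain plan shows 17 steps, and it's quite slow. Can anyone improve it?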
select ua.id, ua.name, ac.avg_count, al50.avg_lat_50, al90.avg_lat_90, al95.avg_lat_95
from
user_actions as ua,
(
select adc.action_id as action_id, avg(adc.count) as avg_count
from
action_date_count as adc,
(select max(date) as max_date from action_date_count) as md
where
julianday(md.max_date, 'unixepoch', 'localtime') - julianday(adc.date, 'unixepoch', 'localtime') between 1 and 7
group by action_id
) as ac,
(
select lp.action_id as action_id, avg(lp.value) as avg_lat_50
from
latency_percentiles as lp,
(select max(date) as max_date from action_date_count) as md
where
lp.percentile = 50 and
julianday(md.max_date, 'unixepoch', 'localtime') - julianday(lp.date, 'unixepoch', 'localtime') between 1 and 7
group by action_id
) as al50,
(
select lp.action_id as action_id, avg(lp.value) as avg_lat_90
from
latency_percentiles as lp,
(select max(date) as max_date from action_date_count) as md
where
lp.percentile = 90 and
julianday(md.max_date, 'unixepoch', 'localtime') - julianday(lp.date, 'unixepoch', 'localtime') between 1 and 7
group by action_id
) as al90,
(
select lp.action_id as action_id, avg(lp.value) as avg_lat_95
from
latency_percentiles as lp,
(select max(date) as max_date from action_date_count) as md
where
lp.percentile = 95 and
julianday(md.max_date, 'unixepoch', 'localtime') - julianday(lp.date, 'unixepoch', 'localtime') between 1 and 7
group by action_id
) as al95
where ua.id = ac.action_id and ua.id = al50.action_id and ua.id = al90.action_id and ua.id = al95.action_id
order by ac.avg_count desc;
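I'm assuming you already have indexes on the date column of both the action_date_count and latency_percentiles tables.
The problem, then, is that SQLite can't use a date index with the query as you've written it, because the date column is wrapped in julianday(). You can fix this by tweaking your date comparisons.
Instead of this: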
julianday(md.max_date, 'unixepoch', 'localtime') - julianday(lp.date, 'unixepoch', 'localtime') between 1 and 7
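Do this, comparing the raw timestamps (the bounds cover 1 to 7 days before the latest date, as in the original condition):
lp.date between md.max_date - 7 * 24 * 3600 and md.max_date - 24 * 3600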
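To check this on a toy database, here is a minimal sketch using Python's built-in sqlite3 module. The index name idx_lp_date, the sample rows, and the bound parameter :max_date (standing in for md.max_date) are all made up for illustration. The upper bound is one day before the latest date, so the window matches the original `between 1 and 7` condition. The sketch asks SQLite for the query plan of both forms of the comparison and fetches the rows each one returns.

```python
import sqlite3

DAY = 24 * 3600  # seconds per day

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table latency_percentiles (
        action_id integer not null,
        date integer not null,
        percentile integer not null,
        value real not null,
        unique (action_id, date, percentile) on conflict fail
    );
    -- Assumed index (hypothetical name); the answer presumes one exists on date.
    create index idx_lp_date on latency_percentiles (date);
""")

# Ten consecutive days of made-up data for a single action.
base = 19000 * DAY
conn.executemany(
    "insert into latency_percentiles values (?, ?, ?, ?)",
    [(1, base + d * DAY, 50, float(d)) for d in range(10)],
)
params = {"max_date": base + 9 * DAY}

# Original form: the indexed column is hidden inside julianday().
SLOW = """
    select lp.date, lp.value from latency_percentiles as lp
    where lp.percentile = 50 and
      julianday(:max_date, 'unixepoch', 'localtime')
        - julianday(lp.date, 'unixepoch', 'localtime') between 1 and 7
"""

# Rewritten form: the bare column is compared against constant bounds.
FAST = """
    select lp.date, lp.value from latency_percentiles as lp
    where lp.percentile = 50 and
      lp.date between :max_date - 7 * 24 * 3600 and :max_date - 24 * 3600
"""

def plan(sql):
    # The last column of each "explain query plan" row is the readable detail.
    return [row[3] for row in conn.execute("explain query plan " + sql, params)]

def rows(sql):
    return sorted(conn.execute(sql, params).fetchall())

if __name__ == "__main__":
    print(plan(SLOW))
    print(plan(FAST))
    print(rows(SLOW) == rows(FAST))
```

On the SQLite versions I'm aware of, the first plan reports a SCAN and the second a SEARCH using idx_lp_date, though the exact wording of the plan text differs between versions.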
You may also get good results by creating a covering index on latency_percentiles (date, percentile, value). YMMV.
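For the covering index, here is a sketch along the same lines, again with Python's sqlite3 module. One assumption of mine: since each subquery also reads action_id (for the group by), I've added it to the index so that every column the subquery touches lives in the index; with only (date, percentile, value), SQLite would still have to visit the table rows to get action_id. The index name idx_lp_cover, the :max_date parameter, and the sample data are hypothetical.

```python
import sqlite3

DAY = 24 * 3600  # seconds per day

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table latency_percentiles (
        action_id integer not null,
        date integer not null,
        percentile integer not null,
        value real not null,
        unique (action_id, date, percentile) on conflict fail
    );
    -- Hypothetical covering index; action_id is my addition to the answer's column list.
    create index idx_lp_cover
        on latency_percentiles (date, percentile, action_id, value);
""")

# Made-up data: two actions, ten days, three percentiles each.
base = 19000 * DAY
conn.executemany(
    "insert into latency_percentiles values (?, ?, ?, ?)",
    [
        (a, base + d * DAY, p, float(10 * a + d))
        for a in (1, 2)
        for d in range(10)
        for p in (50, 90, 95)
    ],
)
params = {"max_date": base + 9 * DAY}

# One of the per-percentile subqueries, using the rewritten date comparison.
QUERY = """
    select lp.action_id, avg(lp.value) as avg_lat_50
    from latency_percentiles as lp
    where lp.percentile = 50 and
      lp.date between :max_date - 7 * 24 * 3600 and :max_date - 24 * 3600
    group by lp.action_id
    order by lp.action_id
"""

def plan(sql):
    # The last column of each "explain query plan" row is the readable detail.
    return [row[3] for row in conn.execute("explain query plan " + sql, params)]

if __name__ == "__main__":
    for line in plan(QUERY):
        print(line)
    print(conn.execute(QUERY, params).fetchall())
```

If the plan mentions "USING COVERING INDEX idx_lp_cover", the subquery is being answered from the index alone. Whether that is a net win on real data depends on table size and write volume, so it is worth measuring before and after.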