
How to make my query using a self-join faster?

I originally ran two separate queries and merged the results in R to build a cumulative time graph, but I would now like to get the same information from a single query.

Original code:

users <- dbGetQuery(pool, "select id, name
                    from schema.table
                    where (name like '%t%' and name like '%2018%') or
                    (name like '%t%' and name like '%2017%')")
opts <- dbGetQuery(pool, "select id, name, ts
                    from schema.table
                    where name = 'qr_optin'")

all <- merge(users, opts, by = "id")

all <- all %>% 
  mutate(date =  as.Date(all$ts),
         name.x = gsub("t", "", name.x)) %>% 
  group_by(name.x, date) %>% 
  summarise(n = n()) 

Which outputs something like this:

name          date         n 
x          2018-09-09      12
x          2018-09-08      5
y          2018-09-08      4
xy         2018-09-06      8
xy         2018-09-04      9

I'm trying to at least combine the two queries into one join, but this is as far as I've gotten, and it is extremely slow:

select f1.id, f1.name, f2.ts
from schema.table f1
left join schema.table f2 on f2.id = f1.id
where f2.name = ' qr_optin' and
(f1.name like '%t%' and f1.name like '%2018%') or
(f1.name like '%t%' and f1.name like '%2017%')

Simply run pure SQL in Postgres, either for the merge (i.e., the join) or for the summarise step (i.e., the aggregation).

Join level query

select usrs.id, usrs.name, opts.ts
from schema.table as usrs
inner join schema.table as opts 
        on opts.id = usrs.id and opts.name = 'qr_optin'
where (usrs.name like '%t%' and usrs.name like '%2018%') or
      (usrs.name like '%t%' and usrs.name like '%2017%')
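
Putting the opt-in condition in the ON clause also avoids an operator-precedence pitfall: in SQL, AND binds more tightly than OR, so a filter written as A and B or C is read as (A and B) or C. A minimal sketch on a throwaway temp table, where the table name events and the sample rows are made up for illustration:

```sql
-- Hypothetical sample data; events stands in for schema.table.
create temp table events (id int, name text, ts timestamp);
insert into events (id, name, ts) values
  (1, 't2018',    '2018-09-08 10:00'),
  (1, 'qr_optin', '2018-09-08 10:05'),
  (2, 't2017',    '2017-03-01 09:00'),
  (2, 'other',    '2017-03-01 09:30');

-- Unparenthesized filter: parsed as (opt-in and 2018 branch) or (2017 branch),
-- so id 2 comes back although it has no 'qr_optin' row.
select f1.id, f1.name, f2.name as joined_name
from events as f1
inner join events as f2 on f2.id = f1.id
where f2.name = 'qr_optin'
  and f1.name like '%t%' and f1.name like '%2018%'
   or f1.name like '%t%' and f1.name like '%2017%';

-- Opt-in condition in the ON clause: it applies to both branches,
-- so only id 1 is returned.
select f1.id, f1.name, f2.ts
from events as f1
inner join events as f2
        on f2.id = f1.id and f2.name = 'qr_optin'
where f1.name like '%t%'
  and (f1.name like '%2018%' or f1.name like '%2017%');
```

In the unparenthesized form, every joined pair that matches the 2017 branch is kept regardless of the opt-in filter, which may also contribute to the slowness described in the question.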

Aggregation query (with CTE)

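with cte as 
  ( 
    -- string literals take single quotes; double quotes denote identifiers in Postgres
    select usrs.id, replace(usrs.name, 't', '') as usr_name, opts.ts
    from schema.table as usrs
    inner join schema.table as opts 
            on opts.id = usrs.id and opts.name = 'qr_optin'
    where (usrs.name like '%t%' and usrs.name like '%2018%') or
          (usrs.name like '%t%' and usrs.name like '%2017%')
  )

-- cast ts to a date so rows are counted per day, as as.Date(ts) does in the R version
select cte.usr_name as name, cte.ts::date as date, count(*) as n
from cte
group by cte.usr_name, cte.ts::date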

Then pass either query to a DBI::dbGetQuery call in R.

all <- dbGetQuery(pool, "...myquery...")
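
If the single query is still slow, it may help to inspect the plan before changing anything else. The sketch below assumes a typical setup and is not taken from the question: the table demo_events and the index name are hypothetical. A pattern with a leading wildcard such as '%t%' cannot use a plain B-tree index, so an index on the opt-in rows is the more likely candidate.

```sql
-- Hypothetical stand-in for schema.table.
create temp table demo_events (id int, name text, ts timestamp);

-- Partial index covering only the opt-in rows used on the join's right side.
create index demo_events_optin_idx
    on demo_events (id)
    where name = 'qr_optin';

-- Show the actual plan and timings for the join-level query.
explain analyze
select usrs.id, usrs.name, opts.ts
from demo_events as usrs
inner join demo_events as opts
        on opts.id = usrs.id and opts.name = 'qr_optin'
where usrs.name like '%t%'
  and (usrs.name like '%2018%' or usrs.name like '%2017%');
```

Whether the planner actually uses such an index depends on the table size and data distribution, so the explain analyze output is the thing to check.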
