简体   繁体   中英

How to join two tables based on a calculated field?

I have two SQL queries that output the same kind of output and have the same grouping and order:

select date_trunc('month', inserted_at)::date as date, count(id) from payment_logs where payment_logs.event_name = 'subscription_created' group by date order by date desc;

在此处输入图像描述

select date_trunc('month', inserted_at)::date as date, count(id) from users group by date order by date desc;

在此处输入图像描述

I would like to join those two results based on the calculated date field (which is the month), and have a result with 3 columns: date, count_users and count_payment_logs.

How can I achieve that? Thanks.

Something like this

select plog.date as odata, usr.cntusr, plog.cntlog 
from (
    select date_trunc('month', inserted_at)::date as date, count(id) cntlog
    from payment_logs 
    where payment_logs.event_name = 'subscription_created' 
    group by date order by date desc
) plog
join (
    select date_trunc('month', inserted_at)::date as date, count(id) cntusr
    from users 
    group by date 
) usr on plog.data = usr.data
order by odata desc

Nothing wrong with the accepted answer, but I wanted to show an alternative and add some color. Instead of subqueries, you can also use common table expressions (CTEs) which improve readability but also have some other features as well. Here is an example using CTEs:

with payments as (
  select
    date_trunc('month', inserted_at)::date as date,
    count(id) as payment_count
  from payment_logs
  where
    event_name = 'subscription_created'
  group by date
),
users as (
  select
    date_trunc('month', inserted_at)::date as date,
    count(id) as user_count
  from users
  group by date
)
select
  p.date, p.payment_count, u.user_count
from
  payments p
  join users u on
    p.date = u.date
order by
  p.date desc

In my opinion the abstraction is neater and makes the code much easier to follow (and thus maintain).

Other notes:

The order by is expensive, and you can avoid it within each of the subqueries/CTEs since it's being done at the end anyway. The ones in the subqueries will be clobbered by whatever you do in the main query anyway, so just omit them completely. Your results will not differ, and your query will be more efficient.

In this example, you probably don't have any missing months, but it's possible... especially if you expand this concept to future queries. In such a case, you may want to consider a full outer join instead of an inner join (you have months that appear in the users that may not be in the payments or vice versa):

select
  coalesce (p.date, u.date) as date,
  p.payment_count, u.user_count
from
  payments p
  full outer join users u on
    p.date = u.date
order by
  1 desc

Another benefit of CTEs vs subqueries is that you can reuse them. In this example, I want to mimic the full outer join concept but with one additional twist -- I have data from another table by month that I want in the query. The CTE lets me use the CTE for "payments" and "users" as many times as I want. Here I use them in the all_dates CTE and again in the main query. By creating "all_dates" I can now use left joins and avoid weird coalescing in joins (not wrong, just ugly).

with payments as (
  -- same as above
),
users as (
  -- same as above
),
all_dates as (
  select date from payments  -- referred to payments here
  union
  select date from users
)
select
  a.date, ac.days_in_month, p.payment_count, u.user_count
from
  all_dates a
  join accounting_calendar ac on
    a.date = ac.accounting_month
  left join payments p on -- referred to it here again, same CTE
    a.date = p.date
  left join users u on
    a.date = u.date
order by
  p.date desc

The point is you can reuse the CTEs.

A final advantage is that you can declare the CTE materialized or non-materialized (default). The materialized CTE will essentially pre-process and store the results, which in certain cases may have better performance. A non-materialized on, on the other hand, will mimic a standard subquery which is nice because you can pass where clause conditions from outside the query to inside of it.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM