
Joining multiple columns based on aggregate function with SQL

I have two tables, A and B. I'm trying to return player_id, date, spins, coin_in, and revenue, aggregated by player_id and date, but I'm not sure how to join the tables together. I'm still pretty new to SQL.

Table A

| player_id |       date | spins | coin_in |
|-----------|------------|-------|---------|
|    252156 | 2020-05-01 |     0 |       0 |
|    252156 | 2020-05-02 |     5 |  100000 |
|    252156 | 2020-05-03 |     1 |   50000 |
|    252156 | 2020-05-04 |   100 | 1000000 |
|    252156 | 2020-05-05 |    10 |  100000 |
|    923451 | 2020-05-04 |    50 | 1000000 |
|    923451 | 2020-05-05 |     5 |  100000 |

Table B

| player_id |       date |             datetime | revenue |
|-----------|------------|----------------------|---------|
|    252156 | 2020-05-01 | 2020-05-01T22:54:59Z |    9.99 |
|    252156 | 2020-05-01 | 2020-05-01T23:54:59Z |   19.99 |
|    252156 | 2020-05-05 | 2020-05-05T20:54:59Z |   49.99 |
|    252156 | 2020-05-05 | 2020-05-05T21:54:59Z |   99.99 |
|    923451 | 2020-05-04 | 2020-05-04T19:54:59Z |    0.99 |

I tried using an inner join, but it returns only one row per player instead of one row for each date.

SELECT A.player_id, A.date, A.spins, A.coin_in, SUM(revenue)
FROM A
INNER JOIN B ON B.player_id = A.player_id
GROUP BY A.player_id;

| player_id |       date | spins | coin_in | SUM(revenue) |
|-----------|------------|-------|---------|--------------|
|    252156 | 2020-05-01 |     0 |       0 |        899.8 |
|    923451 | 2020-05-04 |    50 | 1000000 |         1.98 |

The best answer depends heavily on the cardinality between the tables.
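
One way to check that cardinality is to count the rows per (player_id, date) in each table. The following is a minimal sketch, assuming SQLite through Python's sqlite3 module (the question does not name a database) and using the sample rows from the question; the helper name `duplicate_keys` is made up for illustration.

```python
import sqlite3

# Sample rows copied from tables A and B in the question.
A_ROWS = [
    (252156, "2020-05-01", 0, 0),
    (252156, "2020-05-02", 5, 100000),
    (252156, "2020-05-03", 1, 50000),
    (252156, "2020-05-04", 100, 1000000),
    (252156, "2020-05-05", 10, 100000),
    (923451, "2020-05-04", 50, 1000000),
    (923451, "2020-05-05", 5, 100000),
]
B_ROWS = [
    (252156, "2020-05-01", "2020-05-01T22:54:59Z", 9.99),
    (252156, "2020-05-01", "2020-05-01T23:54:59Z", 19.99),
    (252156, "2020-05-05", "2020-05-05T20:54:59Z", 49.99),
    (252156, "2020-05-05", "2020-05-05T21:54:59Z", 99.99),
    (923451, "2020-05-04", "2020-05-04T19:54:59Z", 0.99),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (player_id INTEGER, date TEXT, spins INTEGER, coin_in INTEGER)")
conn.execute("CREATE TABLE b (player_id INTEGER, date TEXT, datetime TEXT, revenue REAL)")
conn.executemany("INSERT INTO a VALUES (?, ?, ?, ?)", A_ROWS)
conn.executemany("INSERT INTO b VALUES (?, ?, ?, ?)", B_ROWS)


def duplicate_keys(table):
    """Return (player_id, date, row_count) for keys that occur more than once.

    The table name is interpolated into the SQL text, so this hypothetical
    helper should only be called with trusted literals such as "a" or "b".
    """
    query = f"""
        SELECT player_id, date, COUNT(*) AS n
        FROM {table}
        GROUP BY player_id, date
        HAVING COUNT(*) > 1
        ORDER BY player_id, date
    """
    return conn.execute(query).fetchall()


dupes_a = duplicate_keys("a")  # keys with several rows in A
dupes_b = duplicate_keys("b")  # keys with several rows in B
print("A:", dupes_a)
print("B:", dupes_b)
```

On the sample data, A has exactly one row per player and date, while B has two rows for some of them. That is also why the attempt in the question returns 899.8: joining on player_id alone pairs each of the player's 5 rows in A with all 4 of their rows in B, so the revenue total of 179.96 is counted 5 times.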

The safest approach is UNION ALL, which handles anywhere from 0 to N rows per player and date in each table:

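SELECT player_id, date, SUM(spins) AS spins, SUM(coin_in) AS coin_in, SUM(revenue) AS revenue
FROM (
    -- A supplies spins and coin_in; it has no revenue column
    SELECT player_id, date, spins, coin_in, NULL AS revenue FROM A
    -- B supplies revenue; it has no spins or coin_in columns
    UNION ALL SELECT player_id, date, 0, 0, revenue FROM B
) t
GROUP BY player_id, date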
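
To see what the UNION ALL approach produces, here is a minimal sketch that runs it against the sample data. It assumes SQLite through Python's sqlite3 module, which the question does not specify, and it takes spins and coin_in from A and revenue from B, as in the table definitions above. The ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Sample rows copied from tables A and B in the question.
A_ROWS = [
    (252156, "2020-05-01", 0, 0),
    (252156, "2020-05-02", 5, 100000),
    (252156, "2020-05-03", 1, 50000),
    (252156, "2020-05-04", 100, 1000000),
    (252156, "2020-05-05", 10, 100000),
    (923451, "2020-05-04", 50, 1000000),
    (923451, "2020-05-05", 5, 100000),
]
B_ROWS = [
    (252156, "2020-05-01", "2020-05-01T22:54:59Z", 9.99),
    (252156, "2020-05-01", "2020-05-01T23:54:59Z", 19.99),
    (252156, "2020-05-05", "2020-05-05T20:54:59Z", 49.99),
    (252156, "2020-05-05", "2020-05-05T21:54:59Z", 99.99),
    (923451, "2020-05-04", "2020-05-04T19:54:59Z", 0.99),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (player_id INTEGER, date TEXT, spins INTEGER, coin_in INTEGER)")
conn.execute("CREATE TABLE b (player_id INTEGER, date TEXT, datetime TEXT, revenue REAL)")
conn.executemany("INSERT INTO a VALUES (?, ?, ?, ?)", A_ROWS)
conn.executemany("INSERT INTO b VALUES (?, ?, ?, ?)", B_ROWS)

# Stack both tables into one shape, then aggregate once per player and date.
UNION_QUERY = """
    SELECT player_id, date,
           SUM(spins) AS spins, SUM(coin_in) AS coin_in, SUM(revenue) AS revenue
    FROM (
        SELECT player_id, date, spins, coin_in, NULL AS revenue FROM a
        UNION ALL
        SELECT player_id, date, 0, 0, revenue FROM b
    ) t
    GROUP BY player_id, date
    ORDER BY player_id, date
"""

rows = conn.execute(UNION_QUERY).fetchall()
for row in rows:
    print(row)
```

Dates that have no rows in B come back with a NULL revenue (None in Python), because SUM over only NULL values yields NULL. Wrapping the sum in COALESCE(SUM(revenue), 0) would turn those into 0 if that is preferred.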

If A has all the dates (as shown in your sample data), then a LEFT JOIN is preferable, but you need to pre-aggregate each table first so that the join does not multiply rows:

SELECT a.*, b.revenue
FROM (
    SELECT player_id, date, SUM(a.spins) spins, SUM(a.coin_in) coin_in
    FROM a
    GROUP BY player_id, date
) a
LEFT JOIN (
    SELECT player_id, date, SUM(b.revenue) revenue 
    FROM b
    GROUP BY player_id, date
) b ON a.player_id = b.player_id AND a.date = b.date
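
The effect of pre-aggregating can be checked on the sample data. The sketch below, again assuming SQLite through Python's sqlite3 module (an assumption, since the question names no database), compares a join without pre-aggregation against the pre-aggregated LEFT JOIN. The names `naive` and `fixed` are illustrative only.

```python
import sqlite3

# Sample rows copied from tables A and B in the question.
A_ROWS = [
    (252156, "2020-05-01", 0, 0),
    (252156, "2020-05-02", 5, 100000),
    (252156, "2020-05-03", 1, 50000),
    (252156, "2020-05-04", 100, 1000000),
    (252156, "2020-05-05", 10, 100000),
    (923451, "2020-05-04", 50, 1000000),
    (923451, "2020-05-05", 5, 100000),
]
B_ROWS = [
    (252156, "2020-05-01", "2020-05-01T22:54:59Z", 9.99),
    (252156, "2020-05-01", "2020-05-01T23:54:59Z", 19.99),
    (252156, "2020-05-05", "2020-05-05T20:54:59Z", 49.99),
    (252156, "2020-05-05", "2020-05-05T21:54:59Z", 99.99),
    (923451, "2020-05-04", "2020-05-04T19:54:59Z", 0.99),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (player_id INTEGER, date TEXT, spins INTEGER, coin_in INTEGER)")
conn.execute("CREATE TABLE b (player_id INTEGER, date TEXT, datetime TEXT, revenue REAL)")
conn.executemany("INSERT INTO a VALUES (?, ?, ?, ?)", A_ROWS)
conn.executemany("INSERT INTO b VALUES (?, ?, ?, ?)", B_ROWS)

# Joining first and aggregating afterwards repeats each A row once per
# matching B row, so spins and coin_in are over-counted.
NAIVE_QUERY = """
    SELECT a.player_id, a.date, SUM(a.spins), SUM(a.coin_in), SUM(b.revenue)
    FROM a
    LEFT JOIN b ON a.player_id = b.player_id AND a.date = b.date
    GROUP BY a.player_id, a.date
"""

# Aggregating each table to one row per (player_id, date) before joining
# keeps the join one-to-one.
PREAGG_QUERY = """
    SELECT a.*, b.revenue
    FROM (
        SELECT player_id, date, SUM(spins) AS spins, SUM(coin_in) AS coin_in
        FROM a
        GROUP BY player_id, date
    ) a
    LEFT JOIN (
        SELECT player_id, date, SUM(revenue) AS revenue
        FROM b
        GROUP BY player_id, date
    ) b ON a.player_id = b.player_id AND a.date = b.date
"""

# Index the results by (player_id, date) so single rows are easy to compare.
naive = {(r[0], r[1]): r[2:] for r in conn.execute(NAIVE_QUERY)}
fixed = {(r[0], r[1]): r[2:] for r in conn.execute(PREAGG_QUERY)}

key = (252156, "2020-05-05")  # one row in A, two rows in B
print("naive:", naive[key])
print("fixed:", fixed[key])
```

For player 252156 on 2020-05-05, the join without pre-aggregation reports 20 spins and 200000 coin_in, double the real values, because the single row in A is matched to two rows in B. The pre-aggregated version reports 10 and 100000.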


 