MySQL Query: Aggregation at two different levels

I have two tables:

mysql> select * from report;
+----+----------+------------+------------------+-------------+
| id | campaign | advertiser | impression_count | click_count |
+----+----------+------------+------------------+-------------+
|  1 | camp1    | adv1       |               20 |           6 |
|  2 | camp2    | adv2       |               10 |           2 |
|  3 | camp1    | adv1       |               15 |           3 |
|  4 | camp2    | adv2       |                6 |           1 |
+----+----------+------------+------------------+-------------+
4 rows in set (0.00 sec)

mysql> select * from device;
+-----------+-----------+
| report_id | device_id |
+-----------+-----------+
|         1 | d1        |
|         1 | d2        |
|         2 | d1        |
|         2 | d3        |
|         2 | d4        |
|         3 | d2        |
|         3 | d4        |
|         4 | d3        |
|         4 | d4        |
|         4 | d5        |
+-----------+-----------+
10 rows in set (0.00 sec)

I want a report aggregated at the campaign and advertiser level that contains the sum of impression_count, the sum of click_count, and the number of distinct device_ids. So I wrote the query below:

SELECT 
    campaign,
    advertiser,
    sum(impression_count),
    sum(click_count),
    count(DISTINCT device_id)
FROM report 
LEFT JOIN device ON report.id = device.report_id
GROUP BY campaign, advertiser;
+----------+------------+-----------------------+------------------+---------------------------+
| campaign | advertiser | sum(impression_count) | sum(click_count) | count(distinct device_id) |
+----------+------------+-----------------------+------------------+---------------------------+
| camp1    | adv1       |                    70 |               18 |                         3 |
| camp2    | adv2       |                    48 |                9 |                         4 |
+----------+------------+-----------------------+------------------+---------------------------+

Because of the join, each report row is repeated once per matching device row, so impression_count and click_count are summed several times. What I want is:

+----------+------------+-----------------------+------------------+---------------------------+
| campaign | advertiser | sum(impression_count) | sum(click_count) | count(distinct device_id) |
+----------+------------+-----------------------+------------------+---------------------------+
| camp1    | adv1       |                    35 |               9  |                         3 |
| camp2    | adv2       |                    16 |                3 |                         4 |
+----------+------------+-----------------------+------------------+---------------------------+

http://sqlfiddle.com/#!2/05dd9d/1
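The inflated totals can be reproduced outside MySQL. The following is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for MySQL and using the sample rows from the question; the helper names build_db, fan_out and joined_sums are made up for illustration. It shows how many copies of each report row the LEFT JOIN produces, and that the sums are multiplied by that factor.

```python
import sqlite3


def build_db():
    """Create an in-memory SQLite database with the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE report (id INTEGER PRIMARY KEY, campaign TEXT, advertiser TEXT,
                             impression_count INTEGER, click_count INTEGER);
        CREATE TABLE device (report_id INTEGER, device_id TEXT);
        INSERT INTO report VALUES (1,'camp1','adv1',20,6),(2,'camp2','adv2',10,2),
                                  (3,'camp1','adv1',15,3),(4,'camp2','adv2',6,1);
        INSERT INTO device VALUES (1,'d1'),(1,'d2'),(2,'d1'),(2,'d3'),(2,'d4'),
                                  (3,'d2'),(3,'d4'),(4,'d3'),(4,'d4'),(4,'d5');
    """)
    return conn


def fan_out(conn):
    """For each report row, count how many joined rows the LEFT JOIN yields."""
    return conn.execute("""
        SELECT report.id, impression_count, COUNT(*) AS copies
        FROM report LEFT JOIN device ON report.id = device.report_id
        GROUP BY report.id, impression_count
        ORDER BY report.id
    """).fetchall()


def joined_sums(conn):
    """Run the question's query: sums are taken over the fanned-out rows."""
    return conn.execute("""
        SELECT campaign, advertiser, SUM(impression_count), SUM(click_count),
               COUNT(DISTINCT device_id)
        FROM report LEFT JOIN device ON report.id = device.report_id
        GROUP BY campaign, advertiser
        ORDER BY campaign, advertiser
    """).fetchall()


if __name__ == "__main__":
    conn = build_db()
    print(fan_out(conn))      # copies per report row
    print(joined_sums(conn))  # inflated totals
```

For camp1, reports 1 and 3 each match two devices, so the join sums 20*2 + 15*2 = 70 impressions instead of 35. COUNT(DISTINCT device_id) is unaffected because duplicates are discarded.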

I found a not-so-good solution:

select a.campaign, a.advertiser, a.ic, a.cc, count(distinct dr.device_id)
from (
    select
        group_concat(id) as id,
        sum(impression_count) as ic,
        sum(click_count) as cc,
        campaign, advertiser
    from report har
    group by campaign, advertiser
) a
left join device dr on find_in_set(dr.report_id, a.id)
group by a.id, a.campaign, a.advertiser, a.ic, a.cc;

But this uses GROUP_CONCAT, so it may break when the concatenated id list gets long: MySQL truncates the GROUP_CONCAT result at group_concat_max_len, which defaults to 1024.

What you want is to run two separate queries and then join their result sets. The outer select only picks the columns we actually want and joins the two derived tables on their common columns. You could join on id and report_id instead if you didn't need the distinct devices across an entire campaign.

select `firsttable`.campaign, `firsttable`.advertiser, a, b, c from
  -- sums per (campaign, advertiser); min(id) keeps the query valid under ONLY_FULL_GROUP_BY
  (select min(id) as id, campaign, advertiser, sum(impression_count) as a, sum(click_count) as b
   from report
   group by campaign, advertiser
  ) as firsttable
  left join
  -- distinct devices per (campaign, advertiser)
  (select campaign, advertiser, count(distinct device_id) as c
   from report
   join device on report.id = device.report_id
   group by campaign, advertiser
  ) as secondtable on `firsttable`.campaign=`secondtable`.campaign and
                      `firsttable`.advertiser=`secondtable`.advertiser;

SQLFiddle: http://sqlfiddle.com/#!2/8bd63/20
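The two-derived-table approach can be checked locally. This is a minimal sketch, assuming SQLite via Python's sqlite3 module in place of MySQL; the function names and the aliases sums and devices are hypothetical. The extra report row (id 5, camp1/adv2, no devices) is an assumption inferred from the output tables in this answer, not taken from the question.

```python
import sqlite3


def build_db():
    """Sample data from the question plus one assumed row without devices."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE report (id INTEGER PRIMARY KEY, campaign TEXT, advertiser TEXT,
                             impression_count INTEGER, click_count INTEGER);
        CREATE TABLE device (report_id INTEGER, device_id TEXT);
        INSERT INTO report VALUES (1,'camp1','adv1',20,6),(2,'camp2','adv2',10,2),
                                  (3,'camp1','adv1',15,3),(4,'camp2','adv2',6,1),
                                  (5,'camp1','adv2',900,900);  -- assumed extra row
        INSERT INTO device VALUES (1,'d1'),(1,'d2'),(2,'d1'),(2,'d3'),(2,'d4'),
                                  (3,'d2'),(3,'d4'),(4,'d3'),(4,'d4'),(4,'d5');
    """)
    return conn


def two_level_report(conn):
    """Aggregate sums and distinct devices separately, then join the results."""
    return conn.execute("""
        SELECT sums.campaign, sums.advertiser, sums.a, sums.b, devices.c
        FROM (SELECT campaign, advertiser,
                     SUM(impression_count) AS a, SUM(click_count) AS b
              FROM report
              GROUP BY campaign, advertiser) AS sums
        LEFT JOIN (SELECT r.campaign, r.advertiser,
                          COUNT(DISTINCT d.device_id) AS c
                   FROM report r
                   JOIN device d ON d.report_id = r.id
                   GROUP BY r.campaign, r.advertiser) AS devices
          ON sums.campaign = devices.campaign
         AND sums.advertiser = devices.advertiser
        ORDER BY sums.campaign, sums.advertiser
    """).fetchall()


if __name__ == "__main__":
    for row in two_level_report(build_db()):
        print(row)
```

Because the sums are computed before any join with device, no report row is counted twice. The group without devices comes back with NULL (None in Python) for the device count; wrapping it in COALESCE(devices.c, 0) would be one way to report 0 instead.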

This query combines the following two derived tables. Judging by the output, the fiddle data includes one extra report row (id 5, camp1/adv2) that has no devices:

| ID | CAMPAIGN | ADVERTISER |   A |   B |
|----|----------|------------|-----|-----|
|  1 |    camp1 |       adv1 |  35 |   9 |
|  5 |    camp1 |       adv2 | 900 | 900 |
|  2 |    camp2 |       adv2 |  16 |   3 |

| CAMPAIGN | ADVERTISER | C |
|----------|------------|---|
|    camp1 |       adv1 | 3 |
|    camp2 |       adv2 | 4 |

Result:

| CAMPAIGN | ADVERTISER |   A |   B |      C |
|----------|------------|-----|-----|--------|
|    camp1 |       adv1 |  35 |   9 |      3 |
|    camp1 |       adv2 | 900 | 900 | (null) |
|    camp2 |       adv2 |  16 |   3 |      4 |

The problem with your query is that joining the report table to the device table repeats each report row once per matching device, so the sums count the same impression_count and click_count several times. The joined rows look like this:

| CAMPAIGN | ADVERTISER | IMPRESSION_COUNT | CLICK_COUNT | DEVICE_ID |
|----------|------------|------------------|-------------|-----------|
|    camp1 |       adv1 |               20 |           6 |        d1 |
|    camp1 |       adv1 |               20 |           6 |        d2 |
|    camp2 |       adv2 |               10 |           2 |        d1 |
|    camp2 |       adv2 |               10 |           2 |        d3 |
|    camp2 |       adv2 |               10 |           2 |        d4 |
|    camp1 |       adv1 |               15 |           3 |        d2 |
|    camp1 |       adv1 |               15 |           3 |        d4 |
|    camp2 |       adv2 |                6 |           1 |        d3 |
|    camp2 |       adv2 |                6 |           1 |        d4 |
|    camp2 |       adv2 |                6 |           1 |        d5 |
|    camp1 |       adv2 |              900 |         900 |    (null) |

Perhaps this helps:

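SELECT
    r.campaign,
    r.advertiser,
    SUM(r.impression_count) AS ic,
    SUM(r.click_count) AS cc,
    -- correlate on the grouping columns, not on a single report id,
    -- so that devices are counted across the whole group
    (SELECT COUNT(DISTINCT d.device_id)
       FROM report r2
       JOIN device d ON d.report_id = r2.id
      WHERE r2.campaign = r.campaign
        AND r2.advertiser = r.advertiser) AS dd
FROM report r
GROUP BY r.campaign, r.advertiser;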
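A correlated subquery only yields the per-group device count when it is correlated on the grouping columns (campaign, advertiser) rather than on one report id, since a single id covers just one of the rows in the group. The sketch below checks that form, assuming SQLite via Python's sqlite3 module as a stand-in for MySQL; the function names are hypothetical, and the report row with id 5 is an assumed extra row added to show a group with no devices.

```python
import sqlite3


def build_db():
    """Question's sample data plus one assumed report row with no devices."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE report (id INTEGER PRIMARY KEY, campaign TEXT, advertiser TEXT,
                             impression_count INTEGER, click_count INTEGER);
        CREATE TABLE device (report_id INTEGER, device_id TEXT);
        INSERT INTO report VALUES (1,'camp1','adv1',20,6),(2,'camp2','adv2',10,2),
                                  (3,'camp1','adv1',15,3),(4,'camp2','adv2',6,1),
                                  (5,'camp1','adv2',900,900);  -- assumed extra row
        INSERT INTO device VALUES (1,'d1'),(1,'d2'),(2,'d1'),(2,'d3'),(2,'d4'),
                                  (3,'d2'),(3,'d4'),(4,'d3'),(4,'d4'),(4,'d5');
    """)
    return conn


def per_group_report(conn):
    """Sum over report only; count devices in a subquery tied to the group keys."""
    return conn.execute("""
        SELECT r.campaign, r.advertiser,
               SUM(r.impression_count) AS ic,
               SUM(r.click_count) AS cc,
               (SELECT COUNT(DISTINCT d.device_id)
                  FROM report r2
                  JOIN device d ON d.report_id = r2.id
                 WHERE r2.campaign = r.campaign
                   AND r2.advertiser = r.advertiser) AS dd
        FROM report r
        GROUP BY r.campaign, r.advertiser
        ORDER BY r.campaign, r.advertiser
    """).fetchall()


if __name__ == "__main__":
    for row in per_group_report(build_db()):
        print(row)
```

Unlike the LEFT JOIN of two derived tables, this form reports 0 rather than NULL for a group without devices, because COUNT over an empty set is 0. The subquery runs once per group, which may matter on large tables.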
