MySQL Query: Aggregation at two different levels
I have two tables:
mysql> select * from report;
+----+----------+------------+------------------+-------------+
| id | campaign | advertiser | impression_count | click_count |
+----+----------+------------+------------------+-------------+
| 1 | camp1 | adv1 | 20 | 6 |
| 2 | camp2 | adv2 | 10 | 2 |
| 3 | camp1 | adv1 | 15 | 3 |
| 4 | camp2 | adv2 | 6 | 1 |
+----+----------+------------+------------------+-------------+
4 rows in set (0.00 sec)
mysql> select * from device;
+-----------+-----------+
| report_id | device_id |
+-----------+-----------+
| 1 | d1 |
| 1 | d2 |
| 2 | d1 |
| 2 | d3 |
| 2 | d4 |
| 3 | d2 |
| 3 | d4 |
| 4 | d3 |
| 4 | d4 |
| 4 | d5 |
+-----------+-----------+
10 rows in set (0.00 sec)
I want a report aggregated at the campaign and advertiser level, with the sum of the impression and click counts and the number of distinct device_ids. So I wrote the query below:
SELECT
campaign,
advertiser,
sum(impression_count),
sum(click_count),
count(DISTINCT device_id)
FROM report
LEFT JOIN device ON report.id = device.report_id
GROUP BY campaign, advertiser;
+----------+------------+-----------------------+------------------+---------------------------+
| campaign | advertiser | sum(impression_count) | sum(click_count) | count(distinct device_id) |
+----------+------------+-----------------------+------------------+---------------------------+
| camp1 | adv1 | 70 | 18 | 3 |
| camp2 | adv2 | 48 | 9 | 4 |
+----------+------------+-----------------------+------------------+---------------------------+
Here, because of the join, impression_count and click_count are summed once for every matching device row, so the totals are too high. What I want is:
+----------+------------+-----------------------+------------------+---------------------------+
| campaign | advertiser | sum(impression_count) | sum(click_count) | count(distinct device_id) |
+----------+------------+-----------------------+------------------+---------------------------+
| camp1 | adv1 | 35 | 9 | 3 |
| camp2 | adv2 | 16 | 3 | 4 |
+----------+------------+-----------------------+------------------+---------------------------+
http://sqlfiddle.com/#!2/05dd9d/1
I found a not-so-good solution:
select a.campaign, a.advertiser, a.ic, a.cc, count(distinct dr.device_id)
from (
select
group_concat(id) as id,
sum(impression_count) as ic,
sum(click_count) as cc,
campaign, advertiser
FROM report har GROUP BY campaign, advertiser) a
LEFT JOIN device dr ON FIND_IN_SET(dr.report_id, a.id)
group by a.id;
But this uses group_concat, so it may have problems if the group_concat result is long.
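For reference, the limit behind this concern is MySQL's `group_concat_max_len` system variable (1024 bytes by default). A longer result is truncated, which would drop ids from the FIND_IN_SET match. A minimal sketch of inspecting and raising it for the current session; the new value is arbitrary and only for illustration:

```sql
-- Show the current limit (1024 unless the server was configured otherwise).
SHOW VARIABLES LIKE 'group_concat_max_len';

-- Raise it for this session only; 1000000 is an arbitrary illustrative value.
SET SESSION group_concat_max_len = 1000000;
```

Raising the limit only postpones the problem, and FIND_IN_SET on a long list cannot use an index, so this is a workaround rather than a fix.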
What you want to do is run two separate queries, then join the resulting sets. The outer select just picks the columns we actually want and joins the two temporary tables on a common value. You could also do this with id and report_id if you didn't need the distinct devices in the device table counted across an entire campaign.
select `firsttable`.campaign, `firsttable`.advertiser, a, b, c from
(select min(id) as id, campaign, advertiser, sum(impression_count) as a, sum(click_count) as b
from report
group by campaign, advertiser
) as firsttable
left join
(select campaign, advertiser, count(distinct device_id) as c
from report
join device on id=report_id
group by campaign, advertiser
) as secondtable on `firsttable`.campaign=`secondtable`.campaign and
`firsttable`.advertiser=`secondtable`.advertiser;
SQLFiddle: http://sqlfiddle.com/#!2/8bd63/20
This query combines these two temporary tables (the sample data here includes an extra camp1/adv2 row with no devices, which is why one count comes back null):
| ID | CAMPAIGN | ADVERTISER | A | B |
|----|----------|------------|-----|-----|
| 1 | camp1 | adv1 | 35 | 9 |
| 5 | camp1 | adv2 | 900 | 900 |
| 2 | camp2 | adv2 | 16 | 3 |

| CAMPAIGN | ADVERTISER | C |
|----------|------------|---|
| camp1 | adv1 | 3 |
| camp2 | adv2 | 4 |
Result:
| CAMPAIGN | ADVERTISER | A | B | C |
|----------|------------|-----|-----|--------|
| camp1 | adv1 | 35 | 9 | 3 |
| camp1 | adv2 | 900 | 900 | (null) |
| camp2 | adv2 | 16 | 3 | 4 |
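If a campaign with no devices should show 0 rather than null, the count can be wrapped in COALESCE. A sketch of the same two-subquery approach with that one change; the aliases `t` and `d` are my own naming, not from the original answer:

```sql
SELECT t.campaign,
       t.advertiser,
       t.a,
       t.b,
       COALESCE(d.c, 0) AS c   -- 0 instead of NULL when no device rows match
FROM (
    -- Totals per campaign/advertiser, computed without touching device.
    SELECT campaign, advertiser,
           SUM(impression_count) AS a,
           SUM(click_count)      AS b
    FROM report
    GROUP BY campaign, advertiser
) AS t
LEFT JOIN (
    -- Distinct devices per campaign/advertiser.
    SELECT r.campaign, r.advertiser,
           COUNT(DISTINCT dv.device_id) AS c
    FROM report r
    JOIN device dv ON dv.report_id = r.id
    GROUP BY r.campaign, r.advertiser
) AS d
  ON d.campaign = t.campaign
 AND d.advertiser = t.advertiser;
```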
The problem with your query was that it duplicates report rows when combining the report table with the device table. You would end up with something like this:
| CAMPAIGN | ADVERTISER | IMPRESSION_COUNT | CLICK_COUNT | DEVICE_ID |
|----------|------------|------------------|-------------|-----------|
| camp1 | adv1 | 20 | 6 | d1 |
| camp1 | adv1 | 20 | 6 | d2 |
| camp2 | adv2 | 10 | 2 | d1 |
| camp2 | adv2 | 10 | 2 | d3 |
| camp2 | adv2 | 10 | 2 | d4 |
| camp1 | adv1 | 15 | 3 | d2 |
| camp1 | adv1 | 15 | 3 | d4 |
| camp2 | adv2 | 6 | 1 | d3 |
| camp2 | adv2 | 6 | 1 | d4 |
| camp2 | adv2 | 6 | 1 | d5 |
| camp1 | adv2 | 900 | 900 | (null) |
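One way to see the duplication directly is to count how many joined rows each report row produces, since every copy adds its impression_count and click_count to the sums again. A small sketch against the tables from the question; the column aliases are made up for illustration:

```sql
-- Each report row appears once per matching device row after the join
-- (and once if it has no devices, because of the LEFT JOIN).
SELECT r.id,
       r.impression_count,
       COUNT(*)                      AS joined_rows,
       r.impression_count * COUNT(*) AS contribution_to_sum
FROM report r
LEFT JOIN device d ON d.report_id = r.id
GROUP BY r.id, r.impression_count;
```

With the question's data, reports 1 and 3 each match two devices and so contribute 40 and 30, which add up to the inflated 70 shown for camp1.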
Perhaps this helps you:
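SELECT
campaign,
advertiser,
SUM(impression_count) AS ic,
SUM(click_count) AS cc,
-- count devices across every report row of this campaign/advertiser,
-- not just the devices of a single report id
(SELECT
COUNT(DISTINCT d.device_id)
FROM
device d
JOIN report r2 ON r2.id = d.report_id
WHERE
r2.campaign = r.campaign
AND r2.advertiser = r.advertiser) AS DD
FROM
report r
GROUP BY campaign, advertiser;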