
MySQL sum distinct based on other columns containing multiple LEFT JOIN

I've got 5 tables that I'd like to LEFT JOIN together. The tables are: visitors, offers, contracts1, contracts2 and contracts3.

Query:

SELECT 
        count(DISTINCT visitors.ID) as visitors, 
        sum(
        CASE
        WHEN offers.ACTIVE = 1 THEN 1
        ELSE 0
        END) as offers, 
        count(contracts1.ID) as contracts1, sum(contracts1.PRICE) as sum_contracts1, 
        count(contracts2.ID) contracts2, 
        sum(
        CASE
        WHEN contracts2.PAYMENT = 'YEARLY' THEN contracts2.PRICE
        WHEN contracts2.PAYMENT = 'TWICE' THEN contracts2.PRICE*2
        ELSE contracts2.PRICE*4
        END) as sum_contracts2,
        count(contracts3.ID) as contracts3, sum(contracts3.PRICE) as sum_contracts3
        FROM visitors 
        LEFT JOIN offers ON offers.VISITOR_ID = visitors.ID AND (offers.IP > 100 OR offers.IP < 0)
        LEFT JOIN contracts1 ON 
        (offers.ID = contracts1.ID_OFFER)
        LEFT JOIN contracts2 ON 
        (offers.ID = contracts2.ID_OFFER)
        LEFT JOIN contracts3 ON 
        (offers.ID = contracts3.ID_OFFER)
        WHERE  visitors.TIME >= '2017-01-01 00:00:00' AND visitors.TIME <= '2017-05-25 23:59:59'

The problem is that contracts1, contracts2 and contracts3 have no common column on which they could be joined to each other; they are joined only through the visitors and offers tables. So instead of 20 rows for contracts1, 30 for contracts2 and 50 for contracts3, I get every combination of all of them. A simple GROUP BY at the end of the query would normally solve the problem, but if I use GROUP BY at the end for one of those tables (or all of them), it creates multiple rows instead of the single row I want. It would also break the parts where I count visitors and offers by ID. I can use DISTINCT in the count() parts of the SELECT, but not in the sum() parts, because the PRICE of two contracts may be the same even though their IDs are not (for example, 2 chocolates are 2 rows with different IDs but the same PRICE of 10 dollars each).
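To illustrate the multiplication described above, here is a minimal self-contained sketch. The inline rows are made up for the example (one offer, two contracts1 rows, three contracts2 rows) and stand in for the real tables:

```sql
-- Hypothetical inline data: offer 1 has 2 rows in c1 and 3 rows in c2.
SELECT count(*)              AS joined_rows,   -- 2 * 3 = 6 combinations
       count(DISTINCT c1.ID) AS contracts1,    -- 2, DISTINCT repairs the count
       count(DISTINCT c2.ID) AS contracts2,    -- 3
       sum(c1.PRICE)         AS inflated_sum   -- 60 instead of 20
FROM (SELECT 1 AS ID) AS o
LEFT JOIN (SELECT 1 AS ID, 1 AS ID_OFFER, 10 AS PRICE
           UNION ALL SELECT 2, 1, 10) AS c1 ON c1.ID_OFFER = o.ID
LEFT JOIN (SELECT 1 AS ID, 1 AS ID_OFFER
           UNION ALL SELECT 2, 1
           UNION ALL SELECT 3, 1) AS c2 ON c2.ID_OFFER = o.ID;
```

Each contracts1 row is repeated once per matching contracts2 row, so the sum is tripled, and sum(DISTINCT PRICE) would not help because both prices are 10.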

So my question is:

Is there any way to SUM only those PRICEs of contracts1, contracts2 and contracts3 that have a DISTINCT ID, and so avoid adding up the duplicates? And is it possible without creating a VIEW?

I also tried GROUP BY inside the LEFT JOIN, but when I LEFT JOINed all 3 contracts tables together I again ended up with duplicates, even though I had grouped them beforehand.

Example of expected result:

In the time range stated above I would expect: 80 visitors that have 35 offers, 5 contracts1 with a sum of 1000 euros, 12 contracts2 with a sum of 686 euros and 3 contracts3 with a sum of 12 euros. It is ONE ROW with 8 columns of data.

Instead of the expected result I got: 80 visitors, 35 offers, 180 contracts1 (sum is also wrong), 180 contracts2 (sum is also wrong), 180 contracts3 (sum is also wrong).

With CTEs (supported by MariaDB 10.2.1) I would write something like this:

WITH v AS (
    SELECT ID as VISITOR_ID
    FROM visitors 
    WHERE visitors.TIME >= '2017-01-01 00:00:00'
      AND visitors.TIME <= '2017-05-25 23:59:59'
), o AS (
    SELECT offers.ID as ID_OFFER
    FROM v
    JOIN offers USING(VISITOR_ID)
    WHERE offers.ACTIVE = 1
      AND (offers.IP > 100 OR offers.IP < 0)
), c1 AS (
    SELECT count(*) as contracts1, sum(contracts1.PRICE) as sum_contracts1
    FROM o JOIN contracts1 USING(ID_OFFER)
), c2 AS (
    SELECT
        count(*) contracts2, 
        sum(CASE contracts2.PAYMENT
            WHEN 'YEARLY' THEN contracts2.PRICE
            WHEN 'TWICE'  THEN contracts2.PRICE*2
            ELSE contracts2.PRICE*4
        END) as sum_contracts2
    FROM o JOIN contracts2 USING(ID_OFFER)
), c3 AS (
    SELECT count(*) as contracts3, sum(contracts3.PRICE) as sum_contracts3
    FROM o JOIN contracts3 USING(ID_OFFER)
)
    SELECT c1.*, c2.*, c3.*,
        (SELECT count(*) FROM v) as visitors,
        (SELECT count(*) FROM o) as offers
    FROM c1, c2, c3;

Without CTEs you can rewrite it to use temporary tables:

CREATE TEMPORARY TABLE v AS
    SELECT ID as VISITOR_ID
    FROM visitors 
    WHERE visitors.TIME >= '2017-01-01 00:00:00'
      AND visitors.TIME <= '2017-05-25 23:59:59';

CREATE TEMPORARY TABLE o AS
    SELECT offers.ID as ID_OFFER
    FROM v
    JOIN offers USING(VISITOR_ID)
    WHERE offers.ACTIVE = 1
      AND (offers.IP > 100 OR offers.IP < 0);

CREATE TEMPORARY TABLE c1 AS
    SELECT count(*) as contracts1, sum(contracts1.PRICE) as sum_contracts1
    FROM o JOIN contracts1 USING(ID_OFFER);

CREATE TEMPORARY TABLE c2 AS
    SELECT
        count(*) contracts2, 
        sum(CASE contracts2.PAYMENT
            WHEN 'YEARLY' THEN contracts2.PRICE
            WHEN 'TWICE'  THEN contracts2.PRICE*2
            ELSE contracts2.PRICE*4
        END) as sum_contracts2
    FROM o JOIN contracts2 USING(ID_OFFER);

CREATE TEMPORARY TABLE c3 AS
    SELECT count(*) as contracts3, sum(contracts3.PRICE) as sum_contracts3
    FROM o JOIN contracts3 USING(ID_OFFER);

SELECT c1.*, c2.*, c3.*,
    (SELECT count(*) FROM v) as visitors,
    (SELECT count(*) FROM o) as offers
FROM c1, c2, c3;
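
A third variant, not part of the answer above, is to aggregate each contracts table per offer inside a derived table before joining. This is only a sketch under assumptions: the aliases cnt and total are made up, offers.ID is assumed to be unique, and unlike the queries above it keeps the question's semantics (contracts of inactive offers are still counted). Because every derived table returns at most one row per ID_OFFER, the three joins cannot multiply each other's rows.

```sql
SELECT
    count(DISTINCT visitors.ID)                        AS visitors,
    sum(CASE WHEN offers.ACTIVE = 1 THEN 1 ELSE 0 END) AS offers,
    coalesce(sum(c1.cnt), 0)                           AS contracts1,
    sum(c1.total)                                      AS sum_contracts1,
    coalesce(sum(c2.cnt), 0)                           AS contracts2,
    sum(c2.total)                                      AS sum_contracts2,
    coalesce(sum(c3.cnt), 0)                           AS contracts3,
    sum(c3.total)                                      AS sum_contracts3
FROM visitors
LEFT JOIN offers
       ON offers.VISITOR_ID = visitors.ID
      AND (offers.IP > 100 OR offers.IP < 0)
-- One row per offer: count and total are computed before the join.
LEFT JOIN (SELECT ID_OFFER, count(*) AS cnt, sum(PRICE) AS total
           FROM contracts1
           GROUP BY ID_OFFER) AS c1 ON c1.ID_OFFER = offers.ID
LEFT JOIN (SELECT ID_OFFER, count(*) AS cnt,
                  sum(CASE PAYMENT
                          WHEN 'YEARLY' THEN PRICE
                          WHEN 'TWICE'  THEN PRICE * 2
                          ELSE PRICE * 4
                      END) AS total
           FROM contracts2
           GROUP BY ID_OFFER) AS c2 ON c2.ID_OFFER = offers.ID
LEFT JOIN (SELECT ID_OFFER, count(*) AS cnt, sum(PRICE) AS total
           FROM contracts3
           GROUP BY ID_OFFER) AS c3 ON c3.ID_OFFER = offers.ID
WHERE visitors.TIME >= '2017-01-01 00:00:00'
  AND visitors.TIME <= '2017-05-25 23:59:59';
```

The outer query still sees each offer exactly once, so the visitor and offer counts are unaffected, and it needs neither a view, a CTE nor a temporary table.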

Just a proof of concept that doesn't account for the time and activity constraints or the payment type, but couldn't it be something along these lines?

SELECT
   visitors.ID as VISITOR_ID,
   SUM(CASE WHEN TYPE='contract1' THEN 1 ELSE 0 END) as c1_count,
   SUM(CASE WHEN TYPE='contract1' THEN PRICE ELSE 0 END) as c1_total_price,
   SUM(CASE WHEN TYPE='contract2' THEN 1 ELSE 0 END) as c2_count,
   SUM(CASE WHEN TYPE='contract2' THEN PRICE ELSE 0 END) as c2_total_price,
   SUM(CASE WHEN TYPE='contract3' THEN 1 ELSE 0 END) as c3_count,
   SUM(CASE WHEN TYPE='contract3' THEN PRICE ELSE 0 END) as c3_total_price
FROM (
    -- Assumes every contracts table has a PAYMENT column; if not, select NULL as PAYMENT there.
    -- UNION ALL keeps every row as-is; the TYPE and ID columns already make the rows distinct.
    (SELECT 'contract1' as TYPE, ID, PRICE, ID_OFFER, PAYMENT FROM contracts1)
    UNION ALL
    (SELECT 'contract2' as TYPE, ID, PRICE, ID_OFFER, PAYMENT FROM contracts2)
    UNION ALL
    (SELECT 'contract3' as TYPE, ID, PRICE, ID_OFFER, PAYMENT FROM contracts3)
 ) as all_contracts
 JOIN offers on offers.ID = all_contracts.ID_OFFER
 JOIN visitors on visitors.ID = offers.VISITOR_ID
 GROUP BY visitors.ID;

The idea is that first you merge the different contracts into one result where you store their type in a column called "TYPE" (that's the purpose of the UNION queries). Once you have such a table, in which each contract appears exactly once, you can get your desired result quite straightforwardly. I just outlined how you get the sum and count for each type of contract. Of course, the final query would be a bit more complicated, but the core idea should be the same.
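
As a rough sketch of what that fuller query might look like, the version below adds the time range, the IP condition and the payment multiplier from the question, and drops the GROUP BY so that a single row comes back. The column alias AMOUNT is made up for the example, and computing the multiplier inside the contracts2 branch means contracts1 and contracts3 do not need a PAYMENT column.

```sql
SELECT
    sum(CASE WHEN TYPE = 'contract1' THEN 1 ELSE 0 END)      AS contracts1,
    sum(CASE WHEN TYPE = 'contract1' THEN AMOUNT ELSE 0 END) AS sum_contracts1,
    sum(CASE WHEN TYPE = 'contract2' THEN 1 ELSE 0 END)      AS contracts2,
    sum(CASE WHEN TYPE = 'contract2' THEN AMOUNT ELSE 0 END) AS sum_contracts2,
    sum(CASE WHEN TYPE = 'contract3' THEN 1 ELSE 0 END)      AS contracts3,
    sum(CASE WHEN TYPE = 'contract3' THEN AMOUNT ELSE 0 END) AS sum_contracts3
FROM (
    -- Each contract appears exactly once, tagged with its source table.
    SELECT 'contract1' AS TYPE, ID_OFFER, PRICE AS AMOUNT
    FROM contracts1
    UNION ALL
    SELECT 'contract2', ID_OFFER,
           CASE PAYMENT
               WHEN 'YEARLY' THEN PRICE
               WHEN 'TWICE'  THEN PRICE * 2
               ELSE PRICE * 4
           END
    FROM contracts2
    UNION ALL
    SELECT 'contract3', ID_OFFER, PRICE
    FROM contracts3
) AS all_contracts
JOIN offers
  ON offers.ID = all_contracts.ID_OFFER
 AND (offers.IP > 100 OR offers.IP < 0)
JOIN visitors
  ON visitors.ID = offers.VISITOR_ID
WHERE visitors.TIME >= '2017-01-01 00:00:00'
  AND visitors.TIME <= '2017-05-25 23:59:59';
```

The visitor and offer counts are left out on purpose: the inner joins drop visitors and offers that have no contract, so those two numbers would have to come from separate subqueries. If no contract matches at all, every column is NULL rather than 0.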

But despite your statement that you don't want to use (temporary) views, I would encourage you to try it. I have a feeling that putting "all_contracts" joined with offers and visitors into a temporary view would improve performance, if that's your concern, without making the query too ugly, mainly when you want to see the stats for just one visitor or to filter them further (by time, activity and so on), because unnecessary rows won't be materialized. But that's just an impression, since I haven't tried the query on a bigger data set; you can play with it.
