INNER JOIN on 3 tables with SUM()

I am having a problem trying to JOIN across a total of three tables:

  • Table users: userid, cap (ADSL bandwidth)
  • Table accounting: userid, sessiondate, used bandwidth
  • Table adhoc: userid, date, amount purchased

I want a single query that returns all users with their cap, their used bandwidth for this month, and their adhoc purchases for this month:

< TABLE 1 ><TABLE2><TABLE3>
User   | Cap | Adhoc | Used
marius | 3   | 1     | 3.34
bob    | 1   | 2     | 1.15
(simplified)

Here is the query I am working on:

SELECT
        `msi_adsl`.`id`,
        `msi_adsl`.`username`,
        `msi_adsl`.`realm`,
        `msi_adsl`.`cap_size` AS cap,
        SUM(`adsl_adhoc`.`value`) AS adhoc,
        SUM(`radacct`.`AcctInputOctets` + `radacct`.`AcctOutputOctets`) AS used
FROM
        `msi_adsl`
INNER JOIN
        (`radacct`, `adsl_adhoc`)
ON
        (CONCAT(`msi_adsl`.`username`,'@',`msi_adsl`.`realm`) 
           = `radacct`.`UserName` AND `msi_adsl`.`id`=`adsl_adhoc`.`id`)

WHERE
        `canceled` = '0000-00-00'
AND
        `radacct`.`AcctStartTime`
BETWEEN
        '2010-11-01'
AND
        '2010-11-31'
AND
        `adsl_adhoc`.`time`
BETWEEN
        '2010-11-01 00:00:00'
AND
        '2010-11-31 00:00:00'
GROUP BY
        `radacct`.`UserName`, `adsl_adhoc`.`id` LIMIT 10

The query runs, but it returns wrong values for both adhoc and used. My guess is a logical error in my joins, but I can't see it. Any help is very much appreciated.

Your query layout is too spread out for my taste. In particular, the BETWEEN/AND conditions should be on 1 line each, not 5 lines each. I've also removed the backticks, though you might need them for the 'time' column.

Since your table layouts don't match your sample query, it makes life very difficult. However, the table layouts all include a UserID (which is sensible), so I've written the query to do the relevant joins using the UserID. As I noted in a comment, if your design makes it necessary to use a CONCAT operation to join two tables, then you have a recipe for a performance disaster. Update your actual schema so that the tables can be joined by UserID, as your table layouts suggest should be possible. Obviously, you can use function results in joins, but (unless your DBMS supports 'functional indexes' and you create appropriate indexes) the DBMS won't be able to use indexes on the table where the function is evaluated to speed the queries. For a one-off query, that may not matter; for production queries, it often matters a lot.

There's a chance this will do the job you want. Since you are aggregating over two tables, you need the two sub-queries in the FROM clause.

SELECT u.UserID,
       u.username,
       u.realm,
       u.cap_size AS cap,
       h.AdHoc,
       a.OctetsUsed
  FROM msi_adsl AS u
  -- Total octets per user for November 2010 (November has 30 days)
  JOIN (SELECT UserID, SUM(AcctInputOctets + AcctOutputOctets) AS OctetsUsed
          FROM radacct
         WHERE AcctStartTime BETWEEN '2010-11-01 00:00:00' AND '2010-11-30 23:59:59'
         GROUP BY UserID
       )    AS a ON a.UserID = u.UserID
  -- Total ad hoc purchases per user for the same period
  JOIN (SELECT UserID, SUM(Value) AS AdHoc
          FROM adsl_adhoc
         WHERE time BETWEEN '2010-11-01 00:00:00' AND '2010-11-30 23:59:59'
         GROUP BY UserID
       )    AS h ON h.UserID = u.UserID
 WHERE u.canceled = '0000-00-00'
 LIMIT 10

Each sub-query computes the value of the aggregate for each user over the specified period, generating the UserID and the aggregate value as output columns; the main query then simply pulls the correct user data from the main user table and joins with the aggregate sub-queries.
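To make the difference concrete, here is a minimal sketch that runs both query shapes against a tiny in-memory SQLite database from Python. Everything in it is an assumption for illustration: the table names (users, accounting, adhoc), the column names, and the sample rows are simplified stand-ins loosely based on the table layouts in the question, not the real schema, and SQLite stands in for MySQL. It shows that joining both detail tables directly pairs every session row with every purchase row for the same user, so each SUM() is multiplied, while aggregating each table in its own sub-query first gives one row per user per table.

```python
import sqlite3

# Hypothetical, simplified tables; names and data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users      (userid INTEGER PRIMARY KEY, username TEXT, cap INTEGER);
    CREATE TABLE accounting (userid INTEGER, octets INTEGER);
    CREATE TABLE adhoc      (userid INTEGER, value INTEGER);

    INSERT INTO users VALUES (1, 'marius', 3), (2, 'bob', 1);
    -- marius: 2 sessions and 3 purchases; bob: 1 session and 2 purchases
    INSERT INTO accounting VALUES (1, 100), (1, 200), (2, 50);
    INSERT INTO adhoc VALUES (1, 1), (1, 1), (1, 1), (2, 2), (2, 2);
""")

# Direct join of both detail tables: each session row is paired with each
# purchase row of the same user, so both sums are inflated.
naive = conn.execute("""
    SELECT u.username, SUM(h.value) AS adhoc_total, SUM(a.octets) AS used_total
      FROM users AS u
      JOIN accounting AS a ON a.userid = u.userid
      JOIN adhoc      AS h ON h.userid = u.userid
     GROUP BY u.userid
     ORDER BY u.userid
""").fetchall()

# Pre-aggregated sub-queries: each yields at most one row per user, so the
# outer join cannot multiply rows.
preagg = conn.execute("""
    SELECT u.username, h.adhoc_total, a.used_total
      FROM users AS u
      JOIN (SELECT userid, SUM(octets) AS used_total
              FROM accounting GROUP BY userid) AS a ON a.userid = u.userid
      JOIN (SELECT userid, SUM(value) AS adhoc_total
              FROM adhoc GROUP BY userid) AS h ON h.userid = u.userid
     ORDER BY u.userid
""").fetchall()

print(naive)   # [('marius', 6, 900), ('bob', 4, 100)]
print(preagg)  # [('marius', 3, 300), ('bob', 4, 50)]
```

For marius, the direct join produces 2 x 3 = 6 rows, so the purchase total is doubled and the usage total is tripled; the pre-aggregated form returns the true totals of 3 and 300.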

I think that the problem is here:

FROM  `msi_adsl`
INNER JOIN
        (`radacct`, `adsl_adhoc`)
ON
        (CONCAT(`msi_adsl`.`username`,'@',`msi_adsl`.`realm`)
           = `radacct`.`UserName` AND `msi_adsl`.`id`=`adsl_adhoc`.`id`)

You are mixing joins with a Cartesian product, and this is not a good idea, because it's a lot harder to debug. Try this:

FROM  `msi_adsl`
INNER JOIN
        `radacct`
ON
      CONCAT(`msi_adsl`.`username`,'@',`msi_adsl`.`realm`) = `radacct`.`UserName`
JOIN  `adsl_adhoc` ON  `msi_adsl`.`id`=`adsl_adhoc`.`id`
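When debugging a join written this way, one option is to count the joined rows per user before applying any SUM(). The sketch below does that in Python against an in-memory SQLite database. It is only an illustration under stated assumptions: the tables are cut down to the columns the join needs, the sample rows and the realm 'example.net' are invented, and SQLite's `||` operator replaces MySQL's CONCAT().

```python
import sqlite3

# Hypothetical cut-down versions of the tables; sample data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE msi_adsl   (id INTEGER PRIMARY KEY, username TEXT, realm TEXT);
    CREATE TABLE radacct    (UserName TEXT, AcctInputOctets INTEGER, AcctOutputOctets INTEGER);
    CREATE TABLE adsl_adhoc (id INTEGER, value INTEGER);

    INSERT INTO msi_adsl VALUES (1, 'marius', 'example.net');
    -- two accounting sessions and three ad hoc purchases for the same user
    INSERT INTO radacct VALUES ('marius@example.net', 10, 20),
                               ('marius@example.net', 30, 40);
    INSERT INTO adsl_adhoc VALUES (1, 1), (1, 1), (1, 1);
""")

# Same join shape as above, but counting rows instead of summing values.
rows_per_user = conn.execute("""
    SELECT msi_adsl.id, COUNT(*) AS joined_rows
      FROM msi_adsl
     INNER JOIN radacct
        ON msi_adsl.username || '@' || msi_adsl.realm = radacct.UserName
      JOIN adsl_adhoc ON msi_adsl.id = adsl_adhoc.id
     GROUP BY msi_adsl.id
""").fetchall()

print(rows_per_user)  # [(1, 6)]
```

A count larger than the number of sessions or purchases for a user (here 6, from 2 sessions x 3 purchases) indicates that rows from the two detail tables are being paired with each other, which is worth checking before trusting any sums computed over the joined result.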
