
How to select all dates without data using JOIN SQLs in MySQL?

I have three tables: users, sites, and site_traffic. The users table contains the name of the user and other details about the user. Each user has one or more sites, which are stored in the sites table. Every site has its own traffic data.

What I am trying to accomplish is to select all the dates that have no traffic data for each site, for all users. The result should display each user's name, the site_id of each of that user's sites, and the dates that have no data for each of those sites.

With the query below I am able to get the dates that have no data for just one specific user. How do I modify it to list all the users, their sites, and the dates that have no data for each site?

Here's my query:

SELECT b.dates_without_data
FROM (
    SELECT a.dates AS dates_without_data
    FROM (
        SELECT CURDATE() - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) DAY as dates
        FROM (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as a
        CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as b
        CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) as c
    ) a
    WHERE a.dates >= DATE_SUB(DATE_SUB(NOW(),INTERVAL 1 DAY), INTERVAL 35 DAY)
) b
WHERE b.dates_without_data NOT IN (
    SELECT recorded_on 
    FROM site_traffic, sites, users
    WHERE site_traffic.site_id = sites.site_id
    AND sites.user_id = users.user_id
    AND users.user_id = 1
)
AND b.dates_without_data < DATE_SUB(NOW(),INTERVAL 1 DAY)
ORDER BY b.dates_without_data ASC

Thanks for your help guys.

I would use an anti-join pattern.

First, do a cross join operation between the generated list of possible dates and all sites. That gives us a row for every site, for every day. Then go ahead and do the join to the users table.

The trick is the anti-join. We take that set of all sites and all days, and then "match" it to rows in site_traffic. We just want to return the rows that don't have a match. We can do that with an outer join, and then add a condition in the WHERE clause that excludes a row if it found a match, leaving only the rows that didn't have a match.

Something like this:

 SELECT s.site_id
      , u.user_id
      , d.dt       AS date_without_data
   FROM (
         -- inline view d: one row per candidate day, with all date conditions applied here
         SELECT DATE(NOW()) - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) DAY AS dt
           FROM (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a
          CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b
          CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS c
         HAVING dt >= DATE(NOW()) + INTERVAL -1-35 DAY
            AND dt <  DATE(NOW()) + INTERVAL -1 DAY
        ) d
  CROSS
   JOIN sites s                -- every day paired with every site
   JOIN users u
     ON u.user_id = s.user_id
   LEFT
   JOIN site_traffic t         -- outer join: t.* is NULL when no traffic row matches
     ON t.site_id      = s.site_id
    AND t.recorded_on >= d.dt
    AND t.recorded_on  < d.dt + INTERVAL 1 DAY
  WHERE t.site_id IS NULL      -- anti-join: keep only the site/day pairs with no match
  ORDER BY s.site_id, u.user_id

The trick there is the condition in the WHERE clause. Any row that found a matching row in site_traffic will have a non-NULL value for site_id. (The equality comparison to site_id in the join condition guarantees us that.) So if we exclude all rows that have non-NULL values, we are left with the rows that didn't have a match.
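To see the IS NULL filter in isolation, here is a minimal sketch on toy data. The tables demo_days and demo_traffic and their contents are hypothetical, made up only for illustration; they are not part of the asker's schema.

```sql
-- Hypothetical demo tables (illustrative names and data, not from the question).
CREATE TEMPORARY TABLE demo_days (dt DATE NOT NULL);
CREATE TEMPORARY TABLE demo_traffic (
    site_id     INT  NOT NULL,
    recorded_on DATE NOT NULL
);

INSERT INTO demo_days VALUES ('2024-01-01'), ('2024-01-02'), ('2024-01-03');
INSERT INTO demo_traffic VALUES (1, '2024-01-01'), (1, '2024-01-03');

-- The outer join keeps every day; days with no traffic get NULLs in the t columns.
-- The WHERE clause then throws away every day that did find a match.
SELECT d.dt AS date_without_data
  FROM demo_days d
  LEFT
  JOIN demo_traffic t
    ON t.recorded_on = d.dt
 WHERE t.site_id IS NULL;
```

With this toy data the only row returned is 2024-01-02, the one day that has no row in demo_traffic.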

(I assumed that recorded_on was a datetime, so I used a range comparison to match any value of recorded_on within the given date. If recorded_on is actually a date (with no time), then we could just do a simpler equality comparison.)
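If recorded_on does turn out to be a DATE column (an assumption; the question does not state the type), the equality version might look like the sketch below. It uses the users, sites, and site_traffic tables from the question, and a two-level digit generator (0 to 99 days back), which is enough to cover the 36-day window.

```sql
-- Sketch only: assumes site_traffic.recorded_on is a DATE, not a DATETIME.
SELECT s.site_id
     , u.user_id
     , d.dt AS date_without_data
  FROM (
        SELECT CURDATE() - INTERVAL (a.a + (10 * b.a)) DAY AS dt
          FROM (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a
         CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b
        HAVING dt >= CURDATE() - INTERVAL 36 DAY
           AND dt <  CURDATE() - INTERVAL 1 DAY
       ) d
 CROSS
  JOIN sites s
  JOIN users u
    ON u.user_id = s.user_id
  LEFT
  JOIN site_traffic t
    ON t.site_id     = s.site_id
   AND t.recorded_on = d.dt        -- plain equality replaces the two-sided range check
 WHERE t.site_id IS NULL
 ORDER BY s.site_id, u.user_id, d.dt;
```

The equality form is shorter, but it would silently miss rows if recorded_on ever carried a time component, so the range comparison is the safer choice when the column type is uncertain.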

Add to the SELECT list whatever expressions you need from the u and s tables.

Some people suggest that the inline view d (which generates the list of "all dates") looks kind of messy. But I'm fine with it.

It would be nice if MySQL provided a table-valued function, or some other "prettier" mechanism for generating a series of integer values.

I would include all of the conditions on the date within the view query itself, so that the filtering is done inside the view and the outer query doesn't have to deal with it.
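On newer servers there is another way to build the date list. The sketch below assumes MySQL 8.0 or later, where recursive common table expressions are available; this is a swapped-in technique, not what the query above uses. It generates the same window of days and keeps all the date conditions inside the date-generating part.

```sql
-- Sketch only: requires MySQL 8.0+ (WITH RECURSIVE is not available in 5.x).
WITH RECURSIVE d (dt) AS (
    SELECT CURDATE() - INTERVAL 36 DAY                       -- oldest day in the window
    UNION ALL
    SELECT dt + INTERVAL 1 DAY
      FROM d
     WHERE dt + INTERVAL 1 DAY < CURDATE() - INTERVAL 1 DAY  -- stop before yesterday
)
SELECT s.site_id
     , u.user_id
     , d.dt AS date_without_data
  FROM d
 CROSS
  JOIN sites s
  JOIN users u
    ON u.user_id = s.user_id
  LEFT
  JOIN site_traffic t
    ON t.site_id      = s.site_id
   AND t.recorded_on >= d.dt
   AND t.recorded_on  < d.dt + INTERVAL 1 DAY
 WHERE t.site_id IS NULL
 ORDER BY s.site_id, u.user_id, d.dt;
```

The anti-join part is unchanged; only the source of the d rows differs, so either date generator can be swapped in without touching the rest of the query.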
