
mysql joining one table's data with other tables each row

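I am trying to generate a report for a given date range from the following table.

table_columns => employee_id | date | status

where status 1 = not visited, 2 = visited, 3 = canceled, 4 = pending (awaiting approval). The report should look like this: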

+-------------+------------+-------+-------------+---------+----------+---------+
| employee_id | date       | total | not_visited | visited | canceled | pending |
+-------------+------------+-------+-------------+---------+----------+---------+
|           3 | 2021-06-01 |    10 |          10 |       0 |        0 |       0 |
|           3 | 2021-06-02 |    22 |          10 |       2 |       10 |       0 |
|           3 | 2021-06-03 |    10 |          10 |       0 |        0 |       0 |
|           3 | 2021-06-05 |    11 |          10 |       1 |        0 |       0 |
|           4 | 2021-06-01 |    11 |           8 |       3 |        0 |       0 |
|           5 | 2021-06-01 |    10 |           1 |       9 |        0 |       0 |
+-------------+------------+-------+-------------+---------+----------+---------+

The query for this report is:

select va.employee_id, va.date,
       count(*) as total,
       sum(case when status = 1 then 1 else 0 end) as not_visited,
       sum(case when status = 2 then 1 else 0 end) as visited,
       sum(case when status = 3 then 1 else 0 end) as canceled,
       sum(case when status = 4 then 1 else 0 end) as pending
from visiting_addresses va
where va.date >= '2021-06-01'
  and va.date <= '2021-06-30'
group by va.employee_id, va.date;

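If you look at the result, there is no entry for the date 2021-06-04 for employee_id = 3. There is also no data from 2021-06-06 to 2021-06-30. I need to include these dates in the result, so I tried to create another query that generates the dates in a given range. The following query does that: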

SELECT gen_date
  FROM
    (SELECT v.gen_date
       FROM
         (SELECT ADDDATE('1970-01-01', t4 * 10000 + t3 * 1000 + t2 * 100 + t1 * 10 + t0) gen_date
            FROM
              (SELECT 0 t0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION
               SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION
               SELECT 8 UNION SELECT 9) t0,
              (SELECT 0 t1 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION
               SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION
               SELECT 8 UNION SELECT 9) t1,
              (SELECT 0 t2 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION
               SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION
               SELECT 8 UNION SELECT 9) t2,
              (SELECT 0 t3 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION
               SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION
               SELECT 8 UNION SELECT 9) t3,
              (SELECT 0 t4 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION
               SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION
               SELECT 8 UNION SELECT 9) t4
         ) v
      WHERE v.gen_date BETWEEN '2021-06-01' AND '2021-06-30'
    ) calendar;

This query generates dates like this:

+------------+
| gen_date   |
+------------+
| 2021-06-01 |
| 2021-06-02 |
| 2021-06-03 |
| ...        |
| 2021-06-27 |
| 2021-06-28 |
| 2021-06-29 |
| 2021-06-30 |
+------------+

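Now the question is: how can I join the two queries above so that, for every employee_id, all dates are present in the result? Or is it even possible this way? (The actual table contains 5 million rows. The employee_id column has a cardinality of 3k++, and the date and employee_id columns are indexed.)

You tagged both MySQL and MariaDB. These two DBMS are related, but they are still different DBMS. In MariaDB you can easily generate a series with the built-in seq tables: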

select date '2021-06-01' + interval seq day as date from seq_0_to_29

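This is not available in MySQL, where you would probably use a recursive query instead (recursive CTEs require MySQL 8.0 or later):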

with recursive dates (date) as
(
  select date '2021-06-01'
  union all
  select date + interval 1 day
  from dates
  where date < date '2021-06-30'
)
select date from dates;

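In a recursive query you can of course generate the dates dynamically, e.g. the dates of the last month in the table, or of the current and the previous month.

In any SQL dialect you can join queries. In your case you want all dates (generated as shown) combined with all employees (by selecting from the employee table) or only with the employees that exist in your visiting_addresses table. If you only want employees that have data in the table, use: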
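As a sketch of the dynamic variant, the following generates every day from the first day of the previous month to the last day of the current month. It assumes MySQL 8.0 or later; using curdate() as the reference point is an illustrative choice, not something stated in the question.

```sql
-- Sketch (MySQL 8.0+): current and previous month, relative to today.
with recursive dates (date) as
(
  -- Anchor row: first day of the previous month.
  -- The cast keeps the column type DATE rather than a string.
  select cast(date_format(curdate(), '%Y-%m-01') as date) - interval 1 month
  union all
  -- Recursive step: add one day until the end of the current month.
  select date + interval 1 day
  from dates
  where date < last_day(curdate())
)
select date from dates;
```

MySQL limits recursion to 1000 steps by default (system variable cte_max_recursion_depth). That is ample for two months, but a range longer than roughly 1000 days would need the limit raised.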

select distinct employee_id from visiting_addresses

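To get all combinations, you cross join the two data sets. Then you outer join the data from your table, so that employee/date pairs without visits are kept.

The query has this form: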

select
  employees.employee_id,
  dates.date,
  visits.total,
  visits.not_visited,
  ...
from ( <date sequence query here> ) dates
cross join ( <employee table query here> ) employees
left outer join ( <visits table query here> ) visits
  on visits.date = dates.date
  and visits.employee_id = employees.employee_id
order by employees.employee_id, dates.date;

(If you want this for all employees, simply replace ( <employee table query here> ) employees with the table name employees.)

For readability you may prefer WITH clauses:

with recursive dates (date) as ( <date sequence query here> )
   , employees as ( <employee table query here> )
   , visits as ( <visits table query here> )
select 
  employees.employee_id,
  dates.date,
  visits.total,
  visits.not_visited,
  ...
from  dates
cross join employees
left outer join visits
  on visits.date = dates.date
  and visits.employee_id = employees.employee_id
order by employees.employee_id, dates.date;
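With the placeholders filled in from the queries shown earlier, the whole report might look like the sketch below. It assumes MySQL 8.0 or later and takes the employee list from visiting_addresses itself. The coalesce(..., 0) calls are an addition here, on the assumption that dates without visits should show 0 rather than NULL.

```sql
-- Sketch (MySQL 8.0+): every employee/date pair in June 2021, with counts.
with recursive dates (date) as
(
  select date '2021-06-01'
  union all
  select date + interval 1 day
  from dates
  where date < date '2021-06-30'
)
, employees as
(
  -- Only employees that appear in the visits table.
  select distinct employee_id
  from visiting_addresses
)
, visits as
(
  -- The aggregation from the question, one row per employee and date.
  select employee_id, date,
         count(*) as total,
         sum(case when status = 1 then 1 else 0 end) as not_visited,
         sum(case when status = 2 then 1 else 0 end) as visited,
         sum(case when status = 3 then 1 else 0 end) as canceled,
         sum(case when status = 4 then 1 else 0 end) as pending
  from visiting_addresses
  where date >= date '2021-06-01'
    and date <= date '2021-06-30'
  group by employee_id, date
)
select
  employees.employee_id,
  dates.date,
  coalesce(visits.total, 0)       as total,        -- 0 when there is no visit row
  coalesce(visits.not_visited, 0) as not_visited,
  coalesce(visits.visited, 0)     as visited,
  coalesce(visits.canceled, 0)    as canceled,
  coalesce(visits.pending, 0)     as pending
from dates
cross join employees
left outer join visits
  on visits.date = dates.date
  and visits.employee_id = employees.employee_id
order by employees.employee_id, dates.date;
```

The result has one row per employee per generated date, so with about 3k employees and 30 dates it returns roughly 90k rows regardless of how many visits exist.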

You mention that your table is large. I suggest the following index for this query:

create index idx on visiting_addresses (date, employee_id, status);
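One way to check whether the index helps is to look at the execution plan of the aggregation part. This is only a sketch; whether the optimizer actually picks the index depends on your data and server version.

```sql
-- Sketch: inspect the plan for the per-employee, per-date aggregation.
explain
select employee_id, date,
       count(*) as total,
       sum(case when status = 1 then 1 else 0 end) as not_visited
from visiting_addresses
where date >= date '2021-06-01'
  and date <= date '2021-06-30'
group by employee_id, date;
```

If the index is chosen, the key column of the plan shows idx. "Using index" in the Extra column indicates that the query is answered from the index alone, which is possible here because the index contains all three columns the query reads.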
