Grouping MySQL results by 7 day increments

I'm hoping someone can help me with this.

Assume I have the table described below. A host can show up multiple times on the same date, usually with a different backupsize each time.

+------------------+--------------+
| Field            | Type         | 
+------------------+--------------+
| startdate        | date         |
| host             | varchar(255) | 
| backupsize       | float(6,2)   |  
+------------------+--------------+

How can I find the total backupsize for each 7-day increment, starting with the earliest date and running through the last date? I don't mind if the last few days get cut off because they don't fill a complete 7-day increment.

Desired output (preferred):

+------------+----------+----------+----------+-----
|Week of     | system01 | system02 | system03 | ...
+------------+----------+----------+----------+-----
| 2014/07/30 | 2343.23  | 232.34   | 989.34   |
+------------+----------+----------+----------+-----
| 2014/08/06 | 2334.7   | 874.13   | 234.90   |
+------------+----------+----------+----------+-----
| ...        | ...      | ...      | ...      |

OR

+------------+------------+------------+------
|Host        | 2014/07/30 | 2014/08/06 | ...
+------------+------------+------------+------
| system01   | 2343.23    | 2334.7     | ...  
+------------+------------+------------+-------
| system02   | 232.34     | 874.13     | ...
+------------+------------+------------+-------
| system03   | 989.34     | 234.90     | ...
+------------+------------+------------+-------
| ...        | ...        | ...        |       

The date format is not a concern, as long as each period is identified somehow. The order of the hosts is not a concern either. Thanks!

The simplest way is to get the earliest date and count the number of days from it to each row's date:

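-- datediff(a, b) returns a - b in days, so the later date goes first.
-- floor(days / 7) is the number of whole weeks since the earliest date.
select x.minsd + interval floor(datediff(lb.startdate, x.minsd) / 7) week as `Week of`,
       lb.host,
       sum(lb.backupsize) as backupsize
from listedbelow lb cross join
     (select min(startdate) as minsd from listedbelow) x
group by `Week of`, lb.host
order by 1;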

This produces one row per week and host, with the week's start date and the host on each row. You can pivot the results as you see fit.
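To sanity-check the bucketing arithmetic outside the database, here is a minimal Python sketch of the same idea: count the days since the earliest date, divide by 7 and round down, then sum per bucket and host. The sample rows and the week_of helper are made up for illustration and are not taken from the question's data.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical sample rows: (startdate, host, backupsize)
rows = [
    (date(2014, 7, 30), "system01", 10.0),
    (date(2014, 7, 30), "system01", 5.0),
    (date(2014, 8, 5), "system01", 1.0),
    (date(2014, 8, 6), "system01", 2.0),
    (date(2014, 8, 1), "system02", 4.0),
]


def week_of(d, first):
    """Start date of the 7-day bucket (counted from `first`) that contains d."""
    return first + timedelta(days=7 * ((d - first).days // 7))


first = min(r[0] for r in rows)  # plays the role of min(startdate)
totals = defaultdict(float)
for startdate, host, size in rows:
    totals[(week_of(startdate, first), host)] += size

for (week, host), total in sorted(totals.items()):
    print(week, host, total)
```

In this sample, 2014-08-05 is six days after the earliest date and stays in the first bucket, while 2014-08-06 is seven days after it and starts the second one.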

I'll assume that what you want is the sum of backupsize grouped by host and by the seven-day interval you describe.

My solution would be something like this:

  1. You need to define the first date, and then "create" a column with the date you want (the end of the seven-day period).
  2. Then group by that column.

I think temporary tables and a few tricks with user variables are the best way to tackle this, so:

drop table if exists temp_data;
create temporary table temp_data
select a.*
     -- @d holds the end date (exclusive) of the current seven-day period;
     -- it becomes the date_group column that you'll use later to group the data.
     , @d := case
           -- If the current "host" value is the same as the previous one, then...
           when @host_prev = a.host then
               -- ... if @d is not null and startdate is within the seven-day period,
               -- then leave the value of @d intact; otherwise, add 7 days to it.
               case
                   when @d is not null and a.startdate < @d then @d
                   -- The coalesce() function returns its first non-null argument
                   -- (just as a precaution)
                   else date_add(coalesce(@d, a.startdate), interval 7 day)
               end
           -- If the current "host" value is not the same as the previous one,
           -- then take the current date (the first date of the "new" host) and add
           -- seven days to it.
           else date_add(a.startdate, interval 7 day)
       end as date_group
     -- This is needed to perform the comparison in the "case" piece above
     , @host_prev := a.host as host2
from
       (select @host_prev := '', @d := null) as init -- Initialize the variables
     , yourtable as a
-- IMPORTANT: This will only work if you order the data properly
order by a.host, a.startdate;
-- Add indexes to the temp table, to make things faster
alter table temp_data
   add index h(host),
   add index dg(date_group)
   -- OPTIONAL: You can drop the "host2" column (it is no longer needed)
   -- , drop column host2
   ;

Now you can get the grouped data:

select a.host, a.date_group, sum(a.backupsize) as backupsize
from temp_data as a
group by a.host, a.date_group;

This will give you the unpivoted data. If you want to build a pivot table with it, I recommend you take a look at this article, and/or read this question and its answers. In short, you'll have to build a "dynamic" SQL instruction, prepare a statement with it and execute it.
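To make the "dynamic" SQL idea concrete, here is a rough Python sketch that builds the text of such a pivot statement from a list of host names, with one conditional SUM column per host. The build_pivot_sql helper is hypothetical, the host list would in practice come from something like a SELECT DISTINCT host query, and the quoting is naive, so treat it as an illustration for trusted host names only.

```python
def build_pivot_sql(hosts, table="temp_data"):
    """Return a SELECT with one SUM(CASE ...) column per host (illustrative only)."""
    columns = []
    for host in hosts:
        literal = host.replace("'", "''")  # escape quotes inside the string literal
        alias = host.replace("`", "``")    # escape backticks inside the column alias
        columns.append(
            f"SUM(CASE WHEN host = '{literal}' THEN backupsize ELSE 0 END) AS `{alias}`"
        )
    return (
        "SELECT date_group,\n       "
        + ",\n       ".join(columns)
        + f"\nFROM {table}\nGROUP BY date_group\nORDER BY date_group"
    )


print(build_pivot_sql(["system01", "system02"]))
```

The generated text is what you would then hand to PREPARE and EXECUTE on the MySQL side.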


Of course, if you want to group this by calendar week, there's a simpler approach:

drop table if exists temp_data2;
create temporary table temp_data2
select a.*
     -- The following will give you the end-of-week date
     -- (weekday() returns 0 for Monday through 6 for Sunday)
     , date_add(a.startdate, interval (6 - weekday(a.startdate)) day) as date_group
from yourtable as a;
alter table temp_data2
   add index h(host),
   add index dg(date_group);
select a.host, a.date_group, sum(a.backupsize) as backupsize
from temp_data2 as a
group by a.host, a.date_group;

I leave the pivot part to you.
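As a quick check of the end-of-week expression in the calendar-week variant, the following Python sketch applies the same formula. It relies on Python's date.weekday() using the same numbering as MySQL's weekday() (0 for Monday through 6 for Sunday); the end_of_week helper name and the sample dates are only for illustration.

```python
from datetime import date, timedelta


def end_of_week(d):
    """Sunday of the Monday-to-Sunday week that contains d."""
    # date.weekday() is 0 for Monday through 6 for Sunday
    return d + timedelta(days=6 - d.weekday())


for d in (date(2014, 7, 30), date(2014, 8, 3), date(2014, 8, 4)):
    print(d, "->", end_of_week(d))
```

Note that these calendar weeks always run Monday to Sunday, so they generally do not line up with 7-day increments counted from the earliest date in the table.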

I was able to put together a solution that fits my needs: a procedure that combines concepts from your recommended solutions with some other solutions I found on this site. The procedure sums backupsize by 7-day increments and also pivots the result.

DELIMITER $$
CREATE PROCEDURE `weekly_capacity_by_host`()
BEGIN
SELECT MIN(startdate) into @start_date FROM testtable;

SET @SQL = NULL;
SELECT
  GROUP_CONCAT(DISTINCT
    CONCAT(
      'SUM(if(host=''',host,''', backupsize, 0)) as ''',host,''''
    )
  ) INTO @SQL
FROM testtable;

SET @SQL = CONCAT('SELECT 1 + DATEDIFF(startdate, ''',@start_date,''') DIV 7 AS week_num
  , ''',@start_date,''' + INTERVAL (DATEDIFF(startdate, ''',@start_date,''') DIV 7) WEEK AS week_start,
  ', @SQL,' 
  FROM testtable GROUP BY week_num, week_start ORDER BY week_num'
);

PREPARE stmt FROM @SQL;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
END$$
DELIMITER ;

Output appears as follows:

mysql> call weekly_capacity_by_host;
+----------+------------+----------+----------+----------+----------+
| week_num | week_start | server01 | server02 | server03 | server04 |
+----------+------------+----------+----------+----------+----------+
|        1 | 2014-06-11 |  1231.08 |    37.30 |    12.04 |    68.17 |
|        2 | 2014-06-18 |  1230.98 |    37.30 |    11.76 |    68.13 |
|        3 | 2014-06-25 |  1243.12 |    37.30 |     8.85 |    68.59 |
|        4 | 2014-07-02 |  1234.73 |    37.30 |    11.77 |    67.80 |
|        5 | 2014-07-09 |   341.32 |     0.04 |     0.14 |     4.94 |
+----------+------------+----------+----------+----------+----------+
5 rows in set (0.03 sec)
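For readers who want to follow the pivot logic without running the procedure, here is a small Python sketch of the same steps: compute a week number and week start from the earliest date, then accumulate one column per host. The sample rows are invented and the variable names are only illustrative.

```python
from datetime import date, timedelta

# Hypothetical sample rows: (startdate, host, backupsize)
rows = [
    (date(2014, 6, 11), "server01", 100.0),
    (date(2014, 6, 12), "server02", 30.0),
    (date(2014, 6, 17), "server01", 50.0),
    (date(2014, 6, 18), "server01", 70.0),
]

start = min(r[0] for r in rows)
hosts = sorted({r[1] for r in rows})

table = {}  # week_num -> {"week_start": date, host: total, ...}
for startdate, host, size in rows:
    offset = (startdate - start).days // 7  # same as DATEDIFF(...) DIV 7
    week = table.setdefault(
        offset + 1,
        {"week_start": start + timedelta(weeks=offset), **{h: 0.0 for h in hosts}},
    )
    week[host] += size  # same role as SUM(IF(host = ..., backupsize, 0))

for week_num in sorted(table):
    row = table[week_num]
    print(week_num, row["week_start"], *(row[h] for h in hosts))
```

Hosts with no rows in a given week show up as 0 rather than being omitted, which matches what the SUM(IF(...)) columns do in the procedure.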
