Select statement for averages based on different date ranges in one MySQL query

Basically, I am attempting to make a chart with this data. I am able to put my query into a while loop in PHP to get each average, but I would prefer that this be done with one query producing one result table.

<?php 

date_default_timezone_set('America/Los_Angeles');

include('../connect.php');

$subcategory = 'T-Shirts';

$date = date('Y-m-d', strtotime('-29 days'));
$today = date("Y-m-d");

$subcategory = mysqli_real_escape_string($conp, $subcategory);

echo "<table border=\"1\">";
echo "<tr>";
echo "<th>date</th>";
echo "<th>average</th>";
echo "</tr>";

while (strtotime($date) <= strtotime($today)) {

    $from_date = date ("Y-m-d", strtotime("-29 day", strtotime($date)));

    $query = $conp->query("SELECT ROUND(SUM(OutCount)/30) AS 'average' FROM inventory
    LEFT JOIN item
    ON inventory.itemcode = item.itemcode
    WHERE item.subcategory = '$subcategory'
    AND TrDateTime BETWEEN '$from_date' AND '$date' AND transactiontype like 'OUT_%'"); 

    if($query->num_rows){       
        while($row = mysqli_fetch_array($query, MYSQLI_ASSOC)){
            if(!empty($row['average'])){
                $average = $row['average'];
            }else{
                $average = "N/A";
            }
        }                       
        mysqli_free_result($query);                             
    }else{
        $average = "N/A";
    }

    $date = date ("Y-m-d", strtotime("+1 day", strtotime($date)));

    echo "<tr>";
    echo "<td>" . $date . "</td>";
    echo "<td>" . $average . "</td>";
    echo "</tr>";
}

echo "</table>";

?>

I get all the dates in the past 30 days (including today) and, for each date, the average sales over the range from 29 days prior up to that date.

+------------+----------+  
| date       | average  |  
+------------+----------+  
| 2015-04-09 | 222      |  
| 2015-04-10 | 225      |  
| 2015-04-11 | 219      |  
| ...        | ...      |  
+------------+----------+  

I am able to get everything I need this way, but it runs a separate query for every date (30 in this situation), and doing the work in a single MySQL query would be substantially quicker. I started to come up with a MySQL procedure, but I am not sure how well this will work when I try to call it from PHP.

DELIMITER //
    CREATE PROCEDURE average_daily_sales()
    BEGIN

        SET @today = CURDATE();
        SET @date_var = CURDATE() - INTERVAL 29 DAY;
        SET @from_date = @date_var - INTERVAL 29 DAY;
        SET @to_date = @from_date + INTERVAL 29 DAY;

        label1: WHILE @date_var < @today DO

            SELECT      DATE_FORMAT(trdatetime, '%Y-%m-%d') as 'date', ROUND(SUM(OutCount)/30) AS 'average'
            FROM        inventory
            LEFT JOIN   item
            ON          inventory.itemcode = item.itemcode
            WHERE       item.subcategory = 'T-Shirts'
            AND         trdatetime BETWEEN @from_date - INTERVAL 29 DAY AND @to_date
            AND         transactiontype like 'OUT_%';

            SET @date_var = @date_var + INTERVAL 1 DAY;

        END WHILE label1;    

    END; //
DELIMITER ;

Ultimately, I would prefer a regular MySQL statement that I can use to produce the desired result table in one shot. Any help would be greatly appreciated.

If you create a calendar table and populate it with a range of date values, e.g.

CREATE TABLE cal (dt DATE NOT NULL PRIMARY KEY) ;
INSERT INTO cal VALUES ('2015-04-01'),('2015-04-02'),('2015-04-03'), ... ;

you could use that as a row source, in a query like this:

SELECT cal.dt
     , ( -- correlated subquery references value returned from cal
         SELECT ROUND(SUM(n.OutCount)/30)
           FROM inventory n
           JOIN item t
             ON t.itemcode = n.itemcode
          WHERE t.subcategory = 'foo'
            AND n.TrDateTime >= cal.dt + INTERVAL -28 DAY
            AND n.TrDateTime <  cal.dt + INTERVAL 1 DAY
            AND n.transactiontype LIKE 'OUT_%'
       ) AS `average`
  FROM cal
 WHERE cal.dt >= '2015-04-01'
   AND cal.dt <  '2015-05-01'
 ORDER BY cal.dt

It's not mandatory to create a cal calendar table. We could use an inline view and give it an alias of cal. For example, in the query above, we could replace this line:

  FROM cal

with this:

  FROM ( SELECT DATE('2015-04-01') AS dt
         UNION ALL SELECT DATE('2015-04-02')
         UNION ALL SELECT DATE('2015-04-03')
         UNION ALL SELECT DATE('2015-04-04')
         UNION ALL SELECT DATE('2015-04-05')
       ) cal

Or, if you have a row source that can give you a contiguous series of integers starting at zero, you could manufacture your date values from a base date, for example:

   FROM ( SELECT '2015-04-01' + INTERVAL i.n DAY AS dt
            FROM source_of_integers i
           WHERE i.n >= 0
             AND i.n < 31
           ORDER BY i.n
        ) cal
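
If no integer table is at hand, one possible stand-in on MySQL 8.0 or later is a recursive common table expression. This is only a sketch: the thread itself does not use it, the names seq and n are made up for illustration, and it will not run on MySQL 5.x.

```sql
-- Assumes MySQL 8.0+ (recursive CTEs are not available in 5.x).
-- `seq` and `n` are hypothetical names chosen for this example.
WITH RECURSIVE seq (n) AS (
    SELECT 0                              -- anchor row: start at zero
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < 30    -- stop once n reaches 30
)
SELECT DATE('2015-04-01') + INTERVAL seq.n DAY AS dt
  FROM seq
 ORDER BY seq.n;
```

This yields 31 rows, one per day from 2015-04-01 through 2015-05-01, which is the same shape of row source that the integer table provides.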

Some notes:

The original query shows an outer (LEFT) join, but the equality predicate in the WHERE clause negates the "outerness" of the join, so it's equivalent to an inner join.

Some of the column references in the query are not qualified. Best practice is to qualify all column references, so that the reader can tell which columns come from which tables without having to know the table definitions. (This also protects the statement from breaking in the future, with an "ambiguous column" error, when a column with the same name is added to another table referenced in the query.)
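
To illustrate both notes, here is a sketch of the question's single-range query with every column qualified and the join written as the inner join it effectively is. The table definitions are assumptions made only so the example runs in a scratch database: the column names come from the thread, but the types are guesses, and the fixed date range is just an example.

```sql
-- Hypothetical minimal schemas; the column types are assumptions.
CREATE TABLE IF NOT EXISTS item (
    itemcode    VARCHAR(20) NOT NULL PRIMARY KEY,
    subcategory VARCHAR(50) NOT NULL
);
CREATE TABLE IF NOT EXISTS inventory (
    itemcode        VARCHAR(20) NOT NULL,
    TrDateTime      DATETIME    NOT NULL,
    transactiontype VARCHAR(20) NOT NULL,
    OutCount        INT         NOT NULL
);

-- Each column carries its table alias (n = inventory, t = item),
-- so nothing depends on knowing which table owns which column.
SELECT ROUND(SUM(n.OutCount) / 30) AS average
  FROM inventory n
  JOIN item t
    ON t.itemcode = n.itemcode
 WHERE t.subcategory = 'T-Shirts'
   AND n.TrDateTime >= '2015-04-01'
   AND n.TrDateTime <  '2015-05-01'
   AND n.transactiontype LIKE 'OUT_%';
```

Because the WHERE clause requires a matching item row anyway, switching LEFT JOIN to JOIN does not change which rows are counted.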

FOLLOWUP

Personally, for a limited number of date values, I'd go with the inline view that doesn't reference a table. I'd have the PHP code generate that query for me.

With a starting date, say it's '2015-04-10', I'd take that date value and format it into a query, equivalent to doing this:

$cal = "SELECT DATE('2015-04-10') AS dt" ;

Then I'd spin through a loop, incrementing that date value by 1 day. Each time through the loop, I'd append to $cal a select of the next date. The net effect of running through the loop three times would be equivalent to doing this:

$cal .= " UNION ALL SELECT DATE('2015-04-11')";
$cal .= " UNION ALL SELECT DATE('2015-04-12')";
$cal .= " UNION ALL SELECT DATE('2015-04-13')";

As a less attractive alternative, we could keep repeating the same value of the start date, just increment an integer value, and let MySQL do the date math for us.

$cal .= " UNION ALL SELECT '2015-04-10' + INTERVAL 1 DAY";
$cal .= " UNION ALL SELECT '2015-04-10' + INTERVAL 2 DAY";
$cal .= " UNION ALL SELECT '2015-04-10' + INTERVAL 3 DAY";

Then, I'd just slide the $cal query into the SQL text as an inline view query. Something like this:

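// $subcategory is assumed to be escaped already, as in the question's code.
$sql = "SELECT cal.dt
             , ( SELECT IFNULL(ROUND(SUM(n.OutCount)/30), 0)
                   FROM inventory n
                   JOIN item t ON t.itemcode = n.itemcode
                  WHERE t.subcategory = '$subcategory'
                    AND n.TrDateTime >= cal.dt + INTERVAL -28 DAY
                    AND n.TrDateTime <  cal.dt + INTERVAL 1 DAY
                    AND n.transactiontype LIKE 'OUT_%'
               ) AS average_
          FROM ( " . $cal . " ) cal
         ORDER BY cal.dt";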

Anyway, that's the approach I'd take if this was for a limited number of date values (a couple dozen or so), and if I was only going to be running this query occasionally (not hammering the database server with this query repeatedly, for every request). If I was going to pound the server, I'd create and maintain a real cal table, rather than incur the overhead of materializing a derived table on every query.

Do you have data on each distinct day in the range? If so, this is a slightly complex join operation, but very doable.

You can get the date ranges you need as follows:

        SELECT DISTINCT
               DATE(trdatetime)- INTERVAL 30 DAY AS startdate,
               DATE(trdatetime)                  AS enddateplus1
          FROM inventory
         WHERE trdatetime >= NOW() - INTERVAL 31 DAY

Debug this query. Take a look to make sure you get each date range you want.

Then you can join this to your business query like so:

  SELECT dates.startdate, 
         ROUND(SUM(OutCount)/30) AS 'average'
   FROM (
        SELECT DISTINCT
               DATE(trdatetime)- INTERVAL 30 DAY AS startdate,
               DATE(trdatetime)                  AS enddateplus1
          FROM inventory
         WHERE trdatetime >= NOW() - INTERVAL 31 DAY
        ) dates
   LEFT JOIN inventory i ON i.trdatetime >= dates.startdate
                       AND i.trdatetime <  dates.enddateplus1 
   LEFT JOIN  item ON  i.itemcode = item.itemcode
  WHERE item.subcategory = 'T-Shirts'
    AND transactiontype like 'OUT_%'
  GROUP BY dates.startdate

If your inventory data is sparse, that is, you don't have transactions on all days, then your dates query will be missing some rows.

There's a way to fill in those missing rows, but it's a pain in the neck. Read this for more info: http://www.plumislandmedia.net/mysql/filling-missing-data-sequences-cardinal-integers/

Notice that BETWEEN works very badly indeed for filtering DATETIME or TIMESTAMP values: a date-only upper bound means midnight at the start of that day, so the rest of that day is excluded.
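
A small demonstration of the BETWEEN problem, as a sketch only: between_demo is a hypothetical scratch table, and the two rows are made-up values chosen to sit on the last day of the range.

```sql
-- Scratch table for illustration; `between_demo` is a hypothetical name.
CREATE TABLE between_demo (ts DATETIME NOT NULL);
INSERT INTO between_demo VALUES
    ('2015-04-30 00:00:00'),
    ('2015-04-30 14:30:00');

-- The bound '2015-04-30' is compared as '2015-04-30 00:00:00',
-- so the afternoon row falls outside the range.
SELECT COUNT(*) AS rows_between
  FROM between_demo
 WHERE ts BETWEEN '2015-04-01' AND '2015-04-30';

-- A half-open range (>= start, < day after end) covers the whole last day.
SELECT COUNT(*) AS rows_half_open
  FROM between_demo
 WHERE ts >= '2015-04-01'
   AND ts <  '2015-04-30' + INTERVAL 1 DAY;
```

The first query counts 1 row and the second counts 2, which is why the queries in this thread use >= and < rather than BETWEEN for the TrDateTime range.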

The suggestions from @OllieJones and @spencer7593 each required one of three things: a transaction taking place every day (in order to use SELECT DISTINCT DATE(trdatetime)), another table, or a generated derived table.

SELECT DISTINCT DATE(trdatetime) wasn't an option for me because I did not have transactions for every day.

The hybrid PHP and MySQL example that @spencer7593 suggested generates a derived table very well. The static version took about 1.8 seconds to return a result. The issue is that you need additional PHP to generate this (see @spencer7593's answer):

SELECT cal.dt
     , ( -- correlated subquery references value returned from cal
         SELECT ROUND(SUM(n.OutCount)/30)
           FROM inventory n
           JOIN item t
             ON t.itemcode = n.itemcode
          WHERE t.subcategory = 'foo'
            AND n.TrDateTime >= cal.dt + INTERVAL -28 DAY
            AND n.TrDateTime <  cal.dt + INTERVAL 1 DAY
            AND n.transactiontype LIKE 'OUT_%'
       ) AS `average`
  FROM ( SELECT DATE('2015-04-01') AS dt
        UNION ALL SELECT DATE('2015-04-02')
        UNION ALL SELECT DATE('2015-04-03')
        UNION ALL SELECT DATE('2015-04-04')
        UNION ALL SELECT DATE('2015-04-05')
        UNION ALL SELECT DATE('2015-04-06')
        -- etc... (one UNION ALL SELECT per remaining date)
       ) cal
 WHERE cal.dt >= '2015-04-01'
   AND cal.dt <  '2015-05-01'
 ORDER BY cal.dt

I also attempted to use another one of @spencer7593's suggestions. I created a "source of integers" table with the numbers 0-31, as he suggested. This method took a little over 1.8 seconds.

SELECT cal.sd, cal.ed
     , ( -- correlated subquery references value returned from cal
         SELECT ROUND(SUM(n.OutCount)/30)
           FROM inventory n
           JOIN item t
             ON t.itemcode = n.itemcode
          WHERE t.subcategory = 'foobar'
            AND n.TrDateTime >= cal.ed + INTERVAL -30 DAY
            AND n.TrDateTime <  cal.ed + INTERVAL 1 DAY
            AND n.transactiontype LIKE 'OUT_%'
       ) AS `average`
  FROM ( SELECT (CURDATE() + INTERVAL -30 DAY) + INTERVAL i.n DAY as `ed`, (((CURDATE() + INTERVAL -30 DAY) + INTERVAL i.n DAY) + INTERVAL - 30 DAY) as `sd`
            FROM source_of_integers i
           WHERE i.n >= 0
             AND i.n < 31
           ORDER BY i.n
        ) cal
WHERE cal.ed >= CURDATE() + INTERVAL -29 DAY
   AND cal.ed <=  CURDATE()
 ORDER BY cal.ed;

You need a row source for these dates; there isn't really a way around that. In the end I made a cal table:

CREATE TABLE cal (
    dt DATE NOT NULL PRIMARY KEY
);

CREATE TABLE ints ( i tinyint );

INSERT INTO ints VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

INSERT INTO cal (dt)
SELECT DATE('2010-01-01') + INTERVAL a.i*10000 + b.i*1000 + c.i*100 + d.i*10 + e.i DAY
FROM ints a JOIN ints b JOIN ints c JOIN ints d JOIN ints e
WHERE (a.i*10000 + b.i*1000 + c.i*100 + d.i*10 + e.i) <= 3651
ORDER BY 1;

And then ran a slightly modified version of @spencer7593's answer on it:

SELECT cal.dt
     , ( -- correlated subquery references value returned from cal
         SELECT ROUND(SUM(n.OutCount)/30)
           FROM inventory n
           JOIN item t
             ON t.itemcode = n.itemcode
          WHERE t.subcategory = 'foo'
            AND n.TrDateTime >= cal.dt + INTERVAL -28 DAY
            AND n.TrDateTime <  cal.dt + INTERVAL 1 DAY
            AND n.transactiontype LIKE 'OUT_%'
       ) AS `average`
  FROM cal
WHERE cal.dt >= CURDATE() + INTERVAL -30 DAY
    AND cal.dt <  CURDATE()
ORDER BY cal.dt;

In my opinion, this is the cleanest (least PHP) and best-performing answer.

Here is how I indexed the inventory table to speed it up substantially:

ALTER TABLE inventory ADD KEY (ItemCode, TrDateTime, TransactionType);

Thank you @OllieJones and @spencer7593 for all of your help!
