How to calculate median of an inner join field when using group by?

I have the following query, which returns the number of sales of a specific item and the average price of those sales for each day.

SELECT COUNT(1) AS num_sales, DATE_FORMAT(sales.created_at, '%Y-%m-%d') AS date, AVG(prices.price) AS avg_price
FROM sales INNER JOIN prices ON prices.id = sales.price_id
WHERE prices.item_id = 7503 AND (`prices`.`source` = 0 or (`prices`.`price` >= 400 and `prices`.`source` > 0))
GROUP BY date
ORDER BY date ASC

I also have a for loop that runs a separate query for every day to get the median price (assume the number of results is even):

SELECT prices.price FROM sales INNER JOIN prices ON prices.id = sales.price_id
WHERE prices.item_id = 7503 
AND (`prices`.`source` = 0 or (`prices`.`price` >= 400 and `prices`.`source` > 0))
AND DATE(sales.created_at) = "<THE DATE OF THE CURRENT FOR-LOOP OBJECT>"
ORDER BY prices.price ASC
LIMIT 1 OFFSET <NUMBER OF THE MIDDLE ROW>
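
For reference, the `<NUMBER OF THE MIDDLE ROW>` placeholder could be derived from the day's row count roughly as follows. This is only a sketch: `middle_offsets` is a hypothetical helper that does not appear in the original code, and it assumes OFFSET is zero-based, as it is in MySQL.

```python
def middle_offsets(row_count):
    """Return the zero-based OFFSET value(s) of the middle row(s).

    Hypothetical helper for illustration: an odd row count has one
    middle row, an even row count has two (their mean is the median).
    """
    if row_count <= 0:
        return []
    if row_count % 2 == 1:
        return [row_count // 2]
    return [row_count // 2 - 1, row_count // 2]


if __name__ == "__main__":
    for n in (1, 4, 5):
        print(n, middle_offsets(n))
```

With an even row count, a single `LIMIT 1 OFFSET ...` query returns only one of the two middle rows, so an exact median needs both offsets.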

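As you can imagine, the loop is very slow, because in some cases hundreds of queries have to run against a large table (the sales table has several hundred million rows).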

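How can I rewrite the first SQL query so that it also calculates the median of prices.price, similar to AVG(prices.price)? I have looked at answers such as this one, but I could not wrap my head around how to adapt it to my specific scenario.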

I have spent several hours trying to accomplish this, but my SQL knowledge is not good enough. Any help would be greatly appreciated!

root@ns525077:~# mysql -V
mysql  Ver 14.14 Distrib 5.7.13, for Linux (x86_64) using  EditLine wrapper

Table schemas:

CREATE TABLE `prices` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `item_id` int(11) unsigned NOT NULL,
 `price` decimal(8,2) NOT NULL,
 `net_price` decimal(8,2) NOT NULL,
 `source` tinyint(4) NOT NULL,
 `created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
 `updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
 PRIMARY KEY (`id`),
 UNIQUE KEY `id` (`id`),
 KEY `prices_ibfk_1` (`item_id`),
 CONSTRAINT `prices_ibfk_1` FOREIGN KEY (`item_id`) REFERENCES `items` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=4861375 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

CREATE TABLE `sales` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `price_id` int(11) unsigned DEFAULT NULL,
 `item_key` varchar(40) COLLATE utf8_unicode_ci NOT NULL,
 `created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
 `updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
 PRIMARY KEY (`id`),
 UNIQUE KEY `id` (`id`),
 UNIQUE KEY `item_key` (`item_key`),
 KEY `price_id` (`price_id`),
 KEY `created_at` (`created_at`),
 KEY `price_id__created_at__IX` (`price_id`,`created_at`),
 CONSTRAINT `sales_ibfk_1` FOREIGN KEY (`price_id`) REFERENCES `prices` (`id`) ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=386156944 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

Example output of my first query: [image not included]

After extensive searching, I found an answer to my problem here. Perhaps I did not phrase my question well at first.

I adapted that solution to my situation; here is the working query:

SELECT COUNT(1) AS num_sales,
       DATE_FORMAT(sales.created_at, '%Y-%m-%d') AS date,
       AVG(prices.price) AS avg_price,
       CASE(COUNT(1) % 2)
       WHEN 1 THEN SUBSTRING_INDEX(
           SUBSTRING_INDEX(
               group_concat(prices.price
                            ORDER BY prices.price SEPARATOR ',')
               , ',', (count(*) + 1) / 2)
           , ',', -1)
       ELSE (SUBSTRING_INDEX(
                 SUBSTRING_INDEX(
                     group_concat(prices.price
                                  ORDER BY prices.price SEPARATOR ',')
                     , ',', count(*) / 2)
                 , ',', -1)
             + SUBSTRING_INDEX(
                 SUBSTRING_INDEX(
                     group_concat(prices.price
                                  ORDER BY prices.price SEPARATOR ',')
                     , ',', count(*) / 2 + 1)
                 , ',', -1)) / 2
       END median_price
FROM sales
  INNER JOIN prices ON prices.id = sales.price_id
WHERE prices.item_id = 7381
      AND (`prices`.`source` = 0
           OR (`prices`.`price` >= 400
               AND `prices`.`source` > 0))
GROUP BY date
ORDER BY date ASC;
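
To make the nested SUBSTRING_INDEX calls easier to follow, here is a small Python sketch of the same idea. It is an illustration only, not part of the original answer: `substring_index`, `kth` and `median_from_sorted_csv` are hypothetical names, and `substring_index` only approximates MySQL's function for a single-character delimiter.

```python
from decimal import Decimal


def substring_index(s, delim, count):
    """Rough emulation of MySQL SUBSTRING_INDEX(s, delim, count)."""
    parts = s.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    if count < 0:
        # Everything to the right of the |count|-th delimiter from the end.
        return delim.join(parts[count:])
    return ""


def kth(csv, k):
    """Pick the k-th (1-based) value: keep the first k items, then take the last."""
    return substring_index(substring_index(csv, ",", k), ",", -1)


def median_from_sorted_csv(csv):
    """Median of a comma-separated, already sorted list (like GROUP_CONCAT ... ORDER BY)."""
    n = csv.count(",") + 1
    if n % 2 == 1:
        return Decimal(kth(csv, (n + 1) // 2))
    lower = Decimal(kth(csv, n // 2))
    upper = Decimal(kth(csv, n // 2 + 1))
    return (lower + upper) / 2


if __name__ == "__main__":
    print(median_from_sorted_csv("100.00,400.00,450.00"))         # 400.00
    print(median_from_sorted_csv("100.00,400.00,450.00,500.00"))  # 425.00
```

One caveat worth checking before relying on the query: MySQL truncates GROUP_CONCAT output at group_concat_max_len (1024 bytes by default), so days with many sales may need a larger session value, otherwise the middle element may be cut off.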
