How to calculate median of an inner join field when using group by?
I have the following query, which returns the number of sales of a specific item and the average price of those sales for each day.
SELECT COUNT(1) AS num_sales, DATE_FORMAT(sales.created_at, '%Y-%m-%d') AS date, AVG(prices.price) AS avg_price
FROM sales INNER JOIN prices ON prices.id = sales.price_id
WHERE prices.item_id = 7503 AND (`prices`.`source` = 0 or (`prices`.`price` >= 400 and `prices`.`source` > 0))
GROUP BY date
ORDER BY date ASC
I also have a for loop that runs a separate query for each day to fetch the median price (assume the number of results is even):
SELECT prices.price FROM sales INNER JOIN prices ON prices.id = sales.price_id
WHERE prices.item_id = 7503
AND (`prices`.`source` = 0 or (`prices`.`price` >= 400 and `prices`.`source` > 0))
AND DATE(sales.created_at) = "<THE DATE OF THE CURRENT FOR-LOOP OBJECT>"
ORDER BY prices.price ASC
LIMIT 1 OFFSET <NUMBER OF THE MIDDLE ROW>
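As you can imagine, this is very slow, because in some cases hundreds of queries have to be executed against a large table (the sales table has several hundred million rows).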
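To make the `<NUMBER OF THE MIDDLE ROW>` placeholder concrete, here is a minimal Python sketch of how the zero-based OFFSET of the middle row(s) could be derived from a day's row count. The helper name `median_offsets` is hypothetical and not part of the original loop; it only illustrates that an odd count has one middle row, while an even count has two whose prices would need to be averaged.

```python
def median_offsets(row_count):
    """Return the zero-based OFFSET values of the middle row(s).

    Hypothetical helper: for an odd row count both values are the same
    row; for an even row count they are the two rows around the middle.
    """
    if row_count <= 0:
        raise ValueError("row_count must be positive")
    lower = (row_count - 1) // 2
    upper = row_count // 2
    return lower, upper


print(median_offsets(5))  # (2, 2): a single middle row
print(median_offsets(4))  # (1, 2): two middle rows to average
```

With a single `LIMIT 1 OFFSET n` query per day, only one of these rows is fetched, so for even counts the result is one of the two middle values rather than their mean.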
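How can I rewrite the first SQL query so that it also calculates the median of prices.price, similar to AVG(prices.price)? I have looked at answers such as this one, but I can't wrap my head around how to adapt it to my specific scenario.
I have spent hours trying to get this working, but my SQL knowledge is not good enough. Any help would be greatly appreciated!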
root@ns525077:~# mysql -V
mysql Ver 14.14 Distrib 5.7.13, for Linux (x86_64) using EditLine wrapper
Table schemas:
CREATE TABLE `prices` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`item_id` int(11) unsigned NOT NULL,
`price` decimal(8,2) NOT NULL,
`net_price` decimal(8,2) NOT NULL,
`source` tinyint(4) NOT NULL,
`created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`),
KEY `prices_ibfk_1` (`item_id`),
CONSTRAINT `prices_ibfk_1` FOREIGN KEY (`item_id`) REFERENCES `items` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=4861375 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
CREATE TABLE `sales` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`price_id` int(11) unsigned DEFAULT NULL,
`item_key` varchar(40) COLLATE utf8_unicode_ci NOT NULL,
`created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`),
UNIQUE KEY `item_key` (`item_key`),
KEY `price_id` (`price_id`),
KEY `created_at` (`created_at`),
KEY `price_id__created_at__IX` (`price_id`,`created_at`),
CONSTRAINT `sales_ibfk_1` FOREIGN KEY (`price_id`) REFERENCES `prices` (`id`) ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=386156944 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
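Example output of my first query:
After extensive searching, I found the answer to my problem here. Perhaps I did not phrase my question well at first.
I have adapted that solution to my situation; the working query is below: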
SELECT COUNT(1) AS num_sales,
       DATE_FORMAT(sales.created_at, '%Y-%m-%d') AS date,
       AVG(prices.price) AS avg_price,
       CASE (COUNT(1) % 2)
         WHEN 1 THEN SUBSTRING_INDEX(
                       SUBSTRING_INDEX(
                         GROUP_CONCAT(prices.price ORDER BY prices.price SEPARATOR ','),
                         ',', (COUNT(*) + 1) / 2),
                       ',', -1)
         ELSE (SUBSTRING_INDEX(
                 SUBSTRING_INDEX(
                   GROUP_CONCAT(prices.price ORDER BY prices.price SEPARATOR ','),
                   ',', COUNT(*) / 2),
                 ',', -1)
               + SUBSTRING_INDEX(
                   SUBSTRING_INDEX(
                     GROUP_CONCAT(prices.price ORDER BY prices.price SEPARATOR ','),
                     ',', (COUNT(*) + 1) / 2),
                   ',', -1)) / 2
       END median_price
FROM sales
INNER JOIN prices ON prices.id = sales.price_id
WHERE prices.item_id = 7381
  AND (`prices`.`source` = 0
       OR (`prices`.`price` >= 400
           AND `prices`.`source` > 0))
GROUP BY date
ORDER BY date ASC;
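To check the odd and even branches of the CASE expression without touching the database, the same arithmetic can be mirrored in a few lines of Python. This is only a sketch: `substring_index`, `nth_value`, and `median_from_concat` are hypothetical stand-ins written for illustration, and the rounding of `(COUNT(*) + 1) / 2` to a whole position for even counts is an assumption about how MySQL converts the fractional argument.

```python
from decimal import Decimal, ROUND_HALF_UP


def substring_index(s, delim, count):
    """Rough stand-in for MySQL SUBSTRING_INDEX(s, delim, count)."""
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])   # everything before the count-th delimiter
    if count < 0:
        return delim.join(parts[count:])   # everything after the count-th delimiter from the right
    return ""


def nth_value(concat, n):
    """Return the n-th (1-based) item of a comma-separated string."""
    return substring_index(substring_index(concat, ",", n), ",", -1)


def median_from_concat(prices):
    """Mirror the CASE expression: pick the middle item(s) of the sorted list."""
    ordered = sorted(Decimal(p) for p in prices)
    concat = ",".join(str(p) for p in ordered)  # plays the role of GROUP_CONCAT ... ORDER BY
    count = len(ordered)
    # Assumption: a fractional position such as 2.5 is rounded to 3.
    upper = int((Decimal(count + 1) / 2).quantize(Decimal("1"), rounding=ROUND_HALF_UP))
    if count % 2 == 1:
        return Decimal(nth_value(concat, upper))
    lower = count // 2
    return (Decimal(nth_value(concat, lower)) + Decimal(nth_value(concat, upper))) / 2


print(median_from_concat(["500.00", "400.00", "450.00"]))            # 450.00
print(median_from_concat(["500.00", "400.00", "450.00", "600.00"]))  # 475.00
```

One caveat worth checking before relying on this approach: the result of GROUP_CONCAT is truncated to the length given by the group_concat_max_len system variable (1024 by default), so on days with many sales the concatenated list may be cut off and the computed median may be wrong unless that variable is raised for the session.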