
5th percentile on MySQL (MariaDB)

I'm trying to find the 95th percentile (and the highest buy) of the item price using the orders in my ~300k-row table.

I've been successful in finding the 95th percentile and the highest buy for one single item with this code:

 SELECT type_id,
       Max(price) AS buy,
       Min(price) AS '95th% buy'
FROM   (SELECT *,
               ( Row_number()
                   OVER (
                     partition BY type_id
                     ORDER BY price DESC) ) AS rownr
        FROM   orderbuffertest AS rownr
        WHERE  is_buy_order = 1
        ORDER  BY ( Row_number()
                      OVER (
                        partition BY type_id
                        ORDER BY price DESC) ) ASC) AS t1
WHERE  t1.type_id = 44992
       AND t1.rownr < (SELECT Count(*)
                       FROM   orderbuffertest
                       WHERE  is_buy_order = 1
                              AND type_id = 44992) * 0.05;  

However, now I'm trying to GROUP BY type_id and it's messing up all my values.

Does anybody have an idea of how to GROUP BY type_id in this query? Maybe even ways to improve the original one?

I thank you in advance,

TheJozzle

P.S. Here's a link to my database, if you'd like to mess/test around with it: https://gofile.io/?c=Ga6ODr

This query should give you the results you want. In a CTE, it assigns a ROW_NUMBER by descending price and counts all rows for each type_id and order type (is_buy_order). It then selects the MAX price as the buy price (for is_buy_order = 1), and the minimum price among rows at or above the 95th percentile as the 95th percentile price. In the event that there are no rows in the 95th percentile other than the highest price, the second highest price is returned. Similar logic applies to the generation of the sell and 95th%sell prices:

WITH prices AS (
  SELECT type_id, price, is_buy_order,
         ROW_NUMBER() OVER (PARTITION BY type_id, is_buy_order ORDER BY price DESC) AS rownr,
         COUNT(*) OVER (PARTITION BY type_id, is_buy_order) AS num_rows
  FROM   orderbuffertest
)
SELECT type_id,
       MAX(CASE WHEN is_buy_order = 1 THEN price END) AS buy,
       COALESCE(MIN(CASE WHEN is_buy_order = 1 AND 100.0 * (rownr - 1) / num_rows <= 5 AND rownr != 1 THEN price END), 
                MAX(CASE WHEN is_buy_order = 1 AND rownr = 2 THEN price END)) AS `95th%buy`,
       MIN(CASE WHEN is_buy_order = 0 THEN price END) AS sell,
       COALESCE(MAX(CASE WHEN is_buy_order = 0 AND 100.0 * rownr / num_rows >= 95 AND rownr != num_rows THEN price END), 
                MAX(CASE WHEN is_buy_order = 0 AND rownr = num_rows - 1 THEN price END)) AS `95th%sell`
FROM prices
GROUP BY type_id
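
To make the buy-side cutoff concrete, here is a small Python sketch (not part of the original answer) that mirrors the condition `100.0 * (rownr - 1) / num_rows <= 5 AND rownr != 1` and the COALESCE fallback to `rownr = 2`. The function name `buy_percentile_rank` is hypothetical; it only reports which rank (1 = highest price) ends up supplying the `95th%buy` value for a group of a given size.

```python
def buy_percentile_rank(num_rows):
    """Return the rank (1 = highest price) that supplies the 95th%buy price.

    Hypothetical helper that mirrors the SQL expression; it is not MySQL code.
    """
    # Ranks satisfying 100.0 * (rownr - 1) / num_rows <= 5, excluding rank 1.
    in_band = [r for r in range(2, num_rows + 1)
               if 100.0 * (r - 1) / num_rows <= 5]
    if in_band:
        # Rows are ranked by price DESC, so MIN(price) over the band
        # is the price at the deepest rank inside the band.
        return max(in_band)
    # COALESCE fallback: the second-highest price, or NULL (None here)
    # when the group holds a single row.
    return 2 if num_rows >= 2 else None


for n in (1, 10, 40, 100):
    print(n, buy_percentile_rank(n))
```

For groups with fewer than 20 rows no rank other than 1 satisfies the condition, so the fallback to the second-highest price is what actually produces the value.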

If you can't use CTEs for some reason, you could write the CTE as a subquery:

SELECT type_id,
       MAX(CASE WHEN is_buy_order = 1 THEN price END) AS buy,
       COALESCE(MIN(CASE WHEN is_buy_order = 1 AND 100.0 * (rownr - 1) / num_rows <= 5 AND rownr != 1 THEN price END), 
                MAX(CASE WHEN is_buy_order = 1 AND rownr = 2 THEN price END)) AS `95th%buy`,
       MIN(CASE WHEN is_buy_order = 0 THEN price END) AS sell,
       COALESCE(MAX(CASE WHEN is_buy_order = 0 AND 100.0 * rownr / num_rows >= 95 AND rownr != num_rows THEN price END), 
                MAX(CASE WHEN is_buy_order = 0 AND rownr = num_rows - 1 THEN price END)) AS `95th%sell`
FROM (
  SELECT type_id, price, is_buy_order,
         ROW_NUMBER() OVER (PARTITION BY type_id, is_buy_order ORDER BY price DESC) AS rownr,
         COUNT(*) OVER (PARTITION BY type_id, is_buy_order) AS num_rows
  FROM   orderbuffertest
) prices
GROUP BY type_id
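
If you want to try the buy-side logic without the full database, the following is a minimal sketch that runs it against a tiny in-memory table using Python's built-in sqlite3 module. This is an assumption-laden stand-in, not MySQL or MariaDB: it requires a SQLite build with window-function support (3.25 or later), the sample rows are made up for illustration, and the alias `p95_buy` replaces `95th%buy` purely for convenience. Only the buy columns are computed.

```python
import sqlite3

# Buy-side part of the CTE query, with a simplified column alias.
QUERY = """
WITH prices AS (
  SELECT type_id, price, is_buy_order,
         ROW_NUMBER() OVER (PARTITION BY type_id, is_buy_order ORDER BY price DESC) AS rownr,
         COUNT(*) OVER (PARTITION BY type_id, is_buy_order) AS num_rows
  FROM   orderbuffertest
)
SELECT type_id,
       MAX(CASE WHEN is_buy_order = 1 THEN price END) AS buy,
       COALESCE(MIN(CASE WHEN is_buy_order = 1 AND 100.0 * (rownr - 1) / num_rows <= 5 AND rownr != 1 THEN price END),
                MAX(CASE WHEN is_buy_order = 1 AND rownr = 2 THEN price END)) AS p95_buy
FROM prices
GROUP BY type_id
ORDER BY type_id
"""


def run_demo():
    """Build a made-up sample table in memory and return the query result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orderbuffertest (type_id INTEGER, price REAL, is_buy_order INTEGER)"
    )
    rows = [(1, float(p), 1) for p in range(1, 41)]    # type 1: 40 buy orders, prices 1..40
    rows += [(2, float(p), 1) for p in (10, 20, 30)]   # type 2: only 3 buy orders
    rows += [(1, 50.0, 0), (2, 35.0, 0)]               # sell orders, ignored by the buy columns
    conn.executemany("INSERT INTO orderbuffertest VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

With this sample data, type 1 has enough buy orders for ranks 2 and 3 to fall inside the top 5%, so its percentile price is 38; type 2 has only three buy orders and falls back to its second-highest price, 20.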

Demo on dbfiddle
