SQL: Reuse function result in query without using sub-query

In a MySQL table that stores sale orders, I have a LastReviewed column that holds the date and time the sale order was last modified (type timestamp, default value CURRENT_TIMESTAMP). I'd like to plot the number of sales that were modified each day, for the last 90 days, for a particular user.

I'm trying to craft a SELECT that returns the number of days since the LastReviewed date and how many records fall on each of those days. Below is my query, which works just fine:

SELECT DATEDIFF(CURDATE(), LastReviewed) AS days, COUNT(*) AS number FROM sales
WHERE UserID=123 AND DATEDIFF(CURDATE(),LastReviewed)<=90
GROUP BY days
ORDER BY days ASC

Notice that I am computing DATEDIFF() as well as CURDATE() multiple times for each record. This seems really inefficient, so I'd like to know how I can reuse the result of the previous computation. The first thing I tried was:

SELECT DATEDIFF(CURDATE(), LastReviewed) AS days, COUNT(*) AS number FROM sales
WHERE UserID=123 AND days<=90
GROUP BY days
ORDER BY days ASC

Error: Unknown column 'days' in 'where clause'. So I started to look around the net. Based on another discussion (Can I reuse a calculated field in a SELECT query?), I next tried the following:

SELECT DATEDIFF(CURDATE(), LastReviewed) AS days, COUNT(*) AS number FROM sales
WHERE UserID=123 AND (SELECT days)<=90
GROUP BY days
ORDER BY days ASC

Error: Unknown column 'days' in 'field list'. I also tried the following:

SELECT @days := DATEDIFF(CURDATE(), LastReviewed) AS days, 
       COUNT(*) AS number FROM sales
WHERE UserID=123 AND @days <=90
GROUP BY days
ORDER BY days ASC

The query returns zero results, so @days<=90 seems to evaluate to false, even though when I put it in the SELECT clause and remove the WHERE clause, I can see some results with @days values below 90.

I've gotten things to work by using a sub-query:

SELECT * FROM (
  SELECT DATEDIFF(CURDATE(),LastReviewed) AS days,
         COUNT(*) AS number FROM sales
  WHERE UserID=123
  GROUP BY days
) AS t
WHERE days<=90
ORDER BY days ASC

However, I don't know whether it's the most efficient way. Not to mention that even this solution computes CURDATE() once per record, even though its value is the same from the start to the end of the query. Isn't that wasteful? Am I overthinking this? Help would be welcome.

Note: Mods, should this be on CodeReview? I posted here because the code I'm trying to use doesn't actually work.

There are actually two problems with your question.

First, you're overlooking the fact that WHERE precedes SELECT. When the server evaluates WHERE <expression>, it then already knows the value of the calculations done to evaluate <expression> and can use those for SELECT.

Worse than that, though, you should almost never write a query that uses a column as an argument to a function in WHERE, since that usually requires the server to evaluate the expression for each row.

Instead, you should use this:

WHERE LastReviewed >= DATE_SUB(CURDATE(), INTERVAL 90 DAY)

The optimizer will see this and get all excited, because DATE_SUB(CURDATE(), INTERVAL 90 DAY) can be resolved to a constant, which can be used on one side of a >= comparison. That means that if an index exists with LastReviewed as the leftmost relevant column, the server can use the index to immediately eliminate all of the rows with LastReviewed < that constant value, that is, everything older than 90 days.

Then DATEDIFF(CURDATE(), LastReviewed) AS days (still needed for SELECT) will only be evaluated against the rows we already know we want.

Add a single index on (UserID, LastReviewed) and the server will be able to pinpoint exactly the relevant rows extremely quickly.
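Putting those pieces together, the whole statement might look like the sketch below. The index name idx_user_reviewed is made up for illustration; the table and column names come from the question. The range test is written with >= so that it keeps the last 90 days, matching DATEDIFF(CURDATE(), LastReviewed) <= 90 in the original query.

```sql
-- Hypothetical index name; the columns are the ones from the question.
CREATE INDEX idx_user_reviewed ON sales (UserID, LastReviewed);

SELECT DATEDIFF(CURDATE(), LastReviewed) AS days,
       COUNT(*) AS number
FROM sales
WHERE UserID = 123
  -- Bare column compared with a constant expression, so the index applies.
  AND LastReviewed >= DATE_SUB(CURDATE(), INTERVAL 90 DAY)
GROUP BY days
ORDER BY days ASC;
```

DATEDIFF() is still there, but only in the SELECT list, where it runs once for each row that survived the index lookup.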

Built-in functions are much less costly than, say, fetching rows.

You could get a lot more performance improvement with the following 'composite' index:

INDEX(UserID, LastReviewed)

and change the WHERE clause to

WHERE UserID=123
  AND LastReviewed >= CURRENT_DATE() - INTERVAL 90 DAY

Your formulation is 'hiding' LastReviewed in a function call, which keeps the index from being used for that column.
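If you want to check this on your own data, one option is to compare the two forms with EXPLAIN. This is only a sketch and assumes the composite index above already exists. I would expect the second plan to use both index columns for a range scan, while the first can at best use the UserID part, but the exact output depends on your MySQL version and data.

```sql
-- Original form: the column is wrapped in DATEDIFF().
EXPLAIN
SELECT COUNT(*) FROM sales
WHERE UserID = 123
  AND DATEDIFF(CURDATE(), LastReviewed) <= 90;

-- Rewritten form: bare column compared with a constant expression.
EXPLAIN
SELECT COUNT(*) FROM sales
WHERE UserID = 123
  AND LastReviewed >= CURRENT_DATE() - INTERVAL 90 DAY;
```

The key, key_len and rows columns of the two plans are the ones worth comparing.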

If you are still not satisfied with that improvement, then consider a nightly query that computes yesterday's statistics and puts them in a "Summary table". From there, the SELECT you mentioned can run even faster.
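A rough sketch of what such a summary table could look like follows. The table name sales_daily_summary, the review_date column and the INT types are assumptions for illustration, not something taken from your schema.

```sql
-- Hypothetical summary table: one row per user per day.
CREATE TABLE sales_daily_summary (
  UserID      INT  NOT NULL,
  review_date DATE NOT NULL,
  number      INT  NOT NULL,
  PRIMARY KEY (UserID, review_date)
);

-- Nightly job, run after midnight: add yesterday's counts.
INSERT INTO sales_daily_summary (UserID, review_date, number)
SELECT UserID, DATE(LastReviewed), COUNT(*)
FROM sales
WHERE LastReviewed >= CURDATE() - INTERVAL 1 DAY
  AND LastReviewed <  CURDATE()
GROUP BY UserID, DATE(LastReviewed);

-- The plot query then reads only a handful of small rows per user.
SELECT DATEDIFF(CURDATE(), review_date) AS days, number
FROM sales_daily_summary
WHERE UserID = 123
  AND review_date >= CURDATE() - INTERVAL 90 DAY
ORDER BY days ASC;
```

Two caveats with this sketch: today's rows are not in the summary until the next nightly run, and because LastReviewed moves whenever an order is modified again, counts stored for past days can drift from what the live query would return.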
