Increase SQL Query Performance (MAX date)

I was searching for how to get the latest occurrences based on col1 and col2.

Let's suppose we have the following table (all rows needed are marked with *):

col1                   col2                    col3  
---------------------------------------------------------
002478                 ABC                 2019-08-23    *
002478                 ABC                 2019-05-14    
002588                 CVMG                2019-01-07    *
002588                 IP                  2019-01-31    *
002588                 MMG                 2019-09-04    *
002588                 MMG                 2019-08-28    
002588                 NUSA                2019-11-04    *
002588                 NUSA                2019-04-24    
002746                 IE                  2019-01-15    *
003467                 IE                  2020-01-10    
003467                 IE                  2020-03-13    *

I was able to get the latest occurrences based on col1 and col2 with the following select.

SELECT t.col1, 
       t.col2, 
       t.col3
FROM 
       teste t
WHERE t.col3 IN (SELECT max(a.col3) 
                 FROM teste a 
                 WHERE a.col1 = t.col1 AND a.col2 = t.col2)

In this example, it only takes about 7 to 10 ms to complete, but on my real database, it takes about 1 hour.

I removed all the JOINs that I could, and the shortest time I've reached was about 55 minutes.

As I'm using Progress, there's no window function (that I'm aware of) like partition by.


Is there another way to solve this problem? The only query I could think of was in that "style".

Here's an SQL Fiddle with that example database.

Another way of writing the same query is to select the rows for which no newer related row exists:

SELECT t.col1, t.col2, t.col3
FROM teste t
WHERE NOT EXISTS
(
  SELECT NULL
  FROM teste t_newer
  WHERE t_newer.col1 = t.col1
    AND t_newer.col2 = t.col2
    AND t_newer.col3 > t.col3
);

This may be faster, slower, or equally fast. It depends on how your DBMS runs the query internally.
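Before comparing timings, it is worth confirming that both formulations return the same rows. The following is a minimal sketch that does so on the sample data from the question. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in for Progress, and it stores col3 as ISO-formatted text so that string comparison matches date order; the helper name latest_rows is made up for this illustration. It says nothing about how fast either query runs on Progress.

```python
import sqlite3

# Sample rows from the question: (col1, col2, col3).
# Assumption: col3 is stored as ISO text, so string order equals date order.
ROWS = [
    ("002478", "ABC", "2019-08-23"),
    ("002478", "ABC", "2019-05-14"),
    ("002588", "CVMG", "2019-01-07"),
    ("002588", "IP", "2019-01-31"),
    ("002588", "MMG", "2019-09-04"),
    ("002588", "MMG", "2019-08-28"),
    ("002588", "NUSA", "2019-11-04"),
    ("002588", "NUSA", "2019-04-24"),
    ("002746", "IE", "2019-01-15"),
    ("003467", "IE", "2020-01-10"),
    ("003467", "IE", "2020-03-13"),
]

# The query from the question.
MAX_QUERY = """
SELECT t.col1, t.col2, t.col3
FROM teste t
WHERE t.col3 IN (SELECT max(a.col3)
                 FROM teste a
                 WHERE a.col1 = t.col1 AND a.col2 = t.col2)
"""

# The NOT EXISTS rewrite from this answer.
NOT_EXISTS_QUERY = """
SELECT t.col1, t.col2, t.col3
FROM teste t
WHERE NOT EXISTS (SELECT NULL
                  FROM teste t_newer
                  WHERE t_newer.col1 = t.col1
                    AND t_newer.col2 = t.col2
                    AND t_newer.col3 > t.col3)
"""


def latest_rows(query):
    """Run a query against a fresh in-memory copy of the sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE teste (col1 TEXT, col2 TEXT, col3 TEXT)")
    conn.executemany("INSERT INTO teste VALUES (?, ?, ?)", ROWS)
    result = sorted(conn.execute(query).fetchall())
    conn.close()
    return result


max_result = latest_rows(MAX_QUERY)
not_exists_result = latest_rows(NOT_EXISTS_QUERY)

# Both formulations should pick the same rows (those marked with * above).
print(max_result == not_exists_result)
for row in not_exists_result:
    print(row)
```

If two rows share the same col1, col2 and col3, both queries return both of them, so they stay equivalent in that case as well.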

With either of the two queries, the DBMS faces the task of quickly looking up other rows with the same col1 and col2. With only the table, the DBMS would have to read it sequentially again and again. This is where indexes come into play. You provide the DBMS with indexes, in which it can look up where the matching rows are in the table.

In your case you want an index on col1 and col2, in order to provide a means to find the related rows. You can also add col3, as it must be compared, too. Maybe it doesn't matter whether the index starts with col1 or col2, maybe it does. How many different col1 values are in the table, and how many different col2 values? If one has just 5 different values and the other 5,000, then start the index with the one with 5,000 values, because for one value you will find fewer rows, i.e. you get to the rows you are interested in faster.
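The number of different values per column can be counted directly. Here is a small sketch of that check on the sample data, again assuming SQLite through Python's sqlite3 module as a stand-in for Progress; the variable names are made up for this illustration, and the COUNT(DISTINCT ...) syntax should be verified against your Progress version. The eleven sample rows are far too few to decide anything; the point is only the shape of the query to run against the real table.

```python
import sqlite3

# Sample rows from the question: (col1, col2, col3).
ROWS = [
    ("002478", "ABC", "2019-08-23"),
    ("002478", "ABC", "2019-05-14"),
    ("002588", "CVMG", "2019-01-07"),
    ("002588", "IP", "2019-01-31"),
    ("002588", "MMG", "2019-09-04"),
    ("002588", "MMG", "2019-08-28"),
    ("002588", "NUSA", "2019-11-04"),
    ("002588", "NUSA", "2019-04-24"),
    ("002746", "IE", "2019-01-15"),
    ("003467", "IE", "2020-01-10"),
    ("003467", "IE", "2020-03-13"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teste (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany("INSERT INTO teste VALUES (?, ?, ?)", ROWS)

# Count how many different values each candidate leading column has.
distinct_col1, distinct_col2 = conn.execute(
    "SELECT COUNT(DISTINCT col1), COUNT(DISTINCT col2) FROM teste"
).fetchone()

# Rule of thumb from the text: lead the index with the column that has
# more distinct values (ties go to col1 here, an arbitrary choice).
leading = "col1" if distinct_col1 >= distinct_col2 else "col2"

print("distinct col1:", distinct_col1)
print("distinct col2:", distinct_col2)
print("candidate leading column:", leading)
conn.close()
```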

An index could then look like this:

create index idx on teste (col1, col2, col3);

The queries stay the same. The DBMS will look at your query and decide whether to use an index or not. For the given queries I am sure it will use the index mentioned, because the queries are all about quickly looking up related rows.
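Whether the index is actually used can be checked by asking the DBMS for its execution plan. The sketch below shows the idea with SQLite's EXPLAIN QUERY PLAN, run through Python's sqlite3 module. This is an assumption made so the example is runnable: Progress has its own way of showing query plans, its optimizer may choose differently, and the helper name plan_details is made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teste (col1 TEXT, col2 TEXT, col3 TEXT)")
# The index suggested above. No rows are needed to inspect the plan.
conn.execute("CREATE INDEX idx ON teste (col1, col2, col3)")

QUERIES = {
    "max": """
        SELECT t.col1, t.col2, t.col3
        FROM teste t
        WHERE t.col3 IN (SELECT max(a.col3)
                         FROM teste a
                         WHERE a.col1 = t.col1 AND a.col2 = t.col2)
    """,
    "not exists": """
        SELECT t.col1, t.col2, t.col3
        FROM teste t
        WHERE NOT EXISTS (SELECT NULL
                          FROM teste t_newer
                          WHERE t_newer.col1 = t.col1
                            AND t_newer.col2 = t.col2
                            AND t_newer.col3 > t.col3)
    """,
}


def plan_details(query):
    """Return the human-readable steps of SQLite's plan for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    # Each plan row is (id, parent, notused, detail); keep the detail text.
    return [row[3] for row in rows]


plans = {name: plan_details(query) for name, query in QUERIES.items()}
for name, details in plans.items():
    print(name)
    for detail in details:
        print("   ", detail)
conn.close()
```

In the printed plan, a step that mentions the index name (idx) means the lookup of related rows goes through the index rather than rereading the whole table.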
