Selecting a single max value

Question

Say I need to pull data from several tables like so:

item 1 - from table 1
item 2 - from table 1
item 3 - from table 1 - but select only max value of item 3 from table 1
item 4 - from table 2 - but select only max value of item 4 from table 2

My query is pretty simple:

select
    a.item 1,
    a.item 2,
    b.item 3,
    c.item 4
from table 1 a
left join (select b.key_item, max(item 3) from table 1, group by key_item) b on a.key_item = b.key_item
left join (select c.key_item, max(item 4) from table 2, group by key_item) c on c.key_item = a.key_item

I am not sure whether my method of pulling just a single max item from a table is the most efficient. Assume both tables have over a million rows. My actual SQL runs forever with this setup.

EDIT: I changed the GROUP BY clause to reflect the comments made. I hope it makes a bit more sense now.

Answer 1

Your best bet is to add an index on table1 and table2, as follows:

ALTER TABLE table1
ADD INDEX `GoodIndexName1` (`key_item`,`item3`);

ALTER TABLE table2
ADD INDEX `GoodIndexName2` (`key_item`,`item4`);

This will allow you to use queries as described in the MySQL documentation for finding the rows holding the group-wise maximum, which appears to be what you are looking for.
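To make the group-wise maximum concrete, here is a minimal sketch on a three-row table. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made purely so the snippet runs anywhere), and the sample rows are invented. The index mirrors the `(key_item, item3)` index suggested above.

```python
import sqlite3

# SQLite stands in for MySQL here; the sample rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (key_item INTEGER, item1 TEXT, item2 TEXT, item3 INTEGER);
    CREATE INDEX GoodIndexName1 ON table1 (key_item, item3);
    INSERT INTO table1 VALUES
        (1, 'a', 'x', 10),
        (1, 'b', 'y', 30),
        (2, 'c', 'z', 20);
""")

# Group-wise maximum: one row per key_item carrying its largest item3.
group_max_sql = (
    "SELECT key_item, MAX(item3) AS item3 FROM table1 "
    "GROUP BY key_item ORDER BY key_item"
)
group_max = conn.execute(group_max_sql).fetchall()
print(group_max)  # [(1, 30), (2, 20)]

# Ask the engine how it plans to run the query. The wording of the plan
# differs between engines and versions, so it is printed, not interpreted.
for row in conn.execute("EXPLAIN QUERY PLAN " + group_max_sql):
    print(row)
```

In MySQL the equivalent check would be to prefix the query with `EXPLAIN` and see whether the new index shows up in the plan.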

Your original (edited) query should work:

select
    a.item1,
    a.item2,
    b.item3,
    c.item4
from table1 a
LEFT OUTER JOIN (
    SELECT
        key_item,
        MAX(item3) AS item3
    FROM table1
    GROUP BY key_item
) b
ON a.key_item = b.key_item
LEFT OUTER JOIN (
    SELECT
        key_item,
        MAX(item4) AS item4
    FROM table2
    GROUP BY key_item
) c
ON c.key_item = a.key_item

If that performs slowly after adding the indexes, try the following too:

SELECT
    a.item1,
    a.item2,
    b.item3,
    c.item4
FROM table1 a
LEFT OUTER JOIN table1 b
ON b.key_item = a.key_item
LEFT OUTER JOIN table1 larger_b
ON larger_b.key_item = b.key_item
AND larger_b.item3 > b.item3
LEFT OUTER JOIN table2 c
ON c.key_item = a.key_item
LEFT OUTER JOIN table2 larger_c
ON larger_c.key_item = c.key_item
AND larger_c.item4 > c.item4
WHERE larger_b.key_item IS NULL
AND larger_c.key_item IS NULL

(I have modified the table and column names only slightly, so that they conform to correct MySQL syntax.)

I work with queries that use the above structure all the time, and they perform very efficiently with indexes like the ones I provided.
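As a sanity check, the sketch below runs both forms of the query (derived tables with MAX, and the self-join that keeps only rows with no larger value) against the same tiny dataset and compares the results. Again SQLite via Python's sqlite3 is used as a stand-in for MySQL, and the rows are invented. One caveat worth knowing: the two forms only agree when the maximum is unique within each key_item. If two rows tie for the largest item3, the self-join form returns one output row per tied row, while the MAX form does not.

```python
import sqlite3

# SQLite stands in for MySQL; sample data is hypothetical, with no ties or NULLs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (key_item INTEGER, item1 TEXT, item2 TEXT, item3 INTEGER);
    CREATE TABLE table2 (key_item INTEGER, item4 INTEGER);
    CREATE INDEX GoodIndexName1 ON table1 (key_item, item3);
    CREATE INDEX GoodIndexName2 ON table2 (key_item, item4);
    INSERT INTO table1 VALUES (1, 'a', 'x', 10), (1, 'b', 'y', 30), (2, 'c', 'z', 20);
    INSERT INTO table2 VALUES (1, 5), (1, 7), (2, 9);
""")

# Form 1: join against derived tables that hold the per-key maximum.
DERIVED = """
SELECT a.item1, a.item2, b.item3, c.item4
FROM table1 a
LEFT OUTER JOIN (SELECT key_item, MAX(item3) AS item3
                 FROM table1 GROUP BY key_item) b
    ON a.key_item = b.key_item
LEFT OUTER JOIN (SELECT key_item, MAX(item4) AS item4
                 FROM table2 GROUP BY key_item) c
    ON c.key_item = a.key_item
ORDER BY a.item1
"""

# Form 2: self-join and keep only rows for which no larger value exists.
ANTI_JOIN = """
SELECT a.item1, a.item2, b.item3, c.item4
FROM table1 a
LEFT OUTER JOIN table1 b ON b.key_item = a.key_item
LEFT OUTER JOIN table1 larger_b
    ON larger_b.key_item = b.key_item AND larger_b.item3 > b.item3
LEFT OUTER JOIN table2 c ON c.key_item = a.key_item
LEFT OUTER JOIN table2 larger_c
    ON larger_c.key_item = c.key_item AND larger_c.item4 > c.item4
WHERE larger_b.key_item IS NULL
  AND larger_c.key_item IS NULL
ORDER BY a.item1
"""

derived_rows = conn.execute(DERIVED).fetchall()
anti_join_rows = conn.execute(ANTI_JOIN).fetchall()
print(derived_rows)
print(derived_rows == anti_join_rows)  # True for this data
```

Which form is faster depends on the engine, the data distribution and the indexes, so it is worth timing both on the real tables.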

That said, I usually use INNER JOINs on the b and c tables, but I don't see why your query should have any issues.

If you still experience performance problems, report the data types of the key_item columns for each table: joining on columns with different data types generally gives poor performance.
