Single SQL Query to return maximum values of unique values of different column

I'm attempting to retrieve the maximum stellarMass value for each unique galaxyId column value. Let me break it down.

First, the following query returns, for each DES.galaxyId in the list, the IDs and stellarMass of the associated objects I'm interested in.

SELECT DES.galaxyId as descID,
   PROG.galaxyId as progID,
   PROG.stellarMass as progStellarMass
FROM Guo2010a..mMR PROG, Guo2010a..mMR DES
WHERE DES.galaxyId in (0,2,5) 
   AND PROG.galaxyId BETWEEN DES.galaxyId AND DES.lastprogenitorId
   AND PROG.snapnum = 48

This returns a table of the form:

-------------------------------------------------
|   descID   |   progID   |   progStellarMass   |
-------------------------------------------------
|   0        |   34       |   8.3345            |
|   0        |   38       |   18.3345           |
|   2        |   198      |   80.3345           |
|   5        |   99       |   6.3345            |
|   5        |   8        |   3.3345            |
-------------------------------------------------

So for each DES.galaxyId/descID in (0,2,5...), multiple results can be returned. What I want to do is, from this result, select the row with the max(progStellarMass) for each unique descID. And I need to do this in a single query.

So, what I want would return the following table:

----------------------------------------------------
|   descID   |   progID   |   MAXprogStellarMass   |
----------------------------------------------------
|   0        |   38       |   18.3345              |
|   2        |   198      |   80.3345              |
|   5        |   99       |   6.3345               |
----------------------------------------------------

Any help would be greatly appreciated. The reason I'm opening a new question is the extra query I have to run first to get the table of data I need to work on.

SELECT descID,progID,progStellarMass
FROM
(
    SELECT RANK() OVER (PARTITION BY DES.galaxyId  ORDER BY PROG.stellarMass DESC) AS RankID, DES.galaxyId as descID,
       PROG.galaxyId as progID,
       PROG.stellarMass as progStellarMass
    FROM Guo2010a..mMR PROG, Guo2010a..mMR DES
    WHERE DES.galaxyId in (0,2,5) 
       AND PROG.galaxyId BETWEEN DES.galaxyId AND DES.lastprogenitorId
       AND PROG.snapnum = 48
) AS WRAP
WHERE RankID = 1
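The RANK() query above can be tried on a small scale. The following is only a sketch under stated assumptions: it uses SQLite through Python's sqlite3 module as a stand-in for the real database (so the table is plain mMR rather than Guo2010a..mMR, and SQLite 3.25 or newer is needed for window functions), and the rows, the descendant IDs (0, 10, 20) and the helper names make_db and max_mass_progenitors are all hypothetical.

```python
import sqlite3

# Hypothetical toy rows: (galaxyId, lastprogenitorId, snapnum, stellarMass).
ROWS = [
    (0, 3, 63, 30.0),       # descendant 0, progenitor IDs span 0..3
    (1, 1, 48, 8.3345),
    (2, 2, 48, 18.3345),
    (3, 3, 47, 99.0),       # wrong snapshot, filtered out by snapnum = 48
    (10, 11, 63, 90.0),     # descendant 10
    (11, 11, 48, 80.3345),
    (20, 22, 63, 12.0),     # descendant 20
    (21, 21, 48, 6.3345),
    (22, 22, 48, 3.3345),
]

# Same shape as the query above, with the toy descendant IDs substituted.
RANK_QUERY = """
SELECT descID, progID, progStellarMass
FROM (
    SELECT RANK() OVER (PARTITION BY DES.galaxyId
                        ORDER BY PROG.stellarMass DESC) AS RankID,
           DES.galaxyId     AS descID,
           PROG.galaxyId    AS progID,
           PROG.stellarMass AS progStellarMass
    FROM mMR PROG, mMR DES
    WHERE DES.galaxyId IN (0, 10, 20)
      AND PROG.galaxyId BETWEEN DES.galaxyId AND DES.lastprogenitorId
      AND PROG.snapnum = 48
) AS WRAP
WHERE RankID = 1
ORDER BY descID
"""


def make_db():
    """Create an in-memory database holding the toy mMR table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mMR (galaxyId INTEGER, lastprogenitorId INTEGER, "
        "snapnum INTEGER, stellarMass REAL)"
    )
    conn.executemany("INSERT INTO mMR VALUES (?, ?, ?, ?)", ROWS)
    return conn


def max_mass_progenitors(conn):
    """Return one (descID, progID, progStellarMass) row per descendant."""
    return conn.execute(RANK_QUERY).fetchall()


if __name__ == "__main__":
    for row in max_mass_progenitors(make_db()):
        print(row)
```

Note that RANK() gives tied rows the same rank, so two progenitors sharing the maximum mass would both come back with RankID = 1; ROW_NUMBER() would return exactly one of them.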

I have a solution that might not be the best, but it should at least work (I haven't actually run it). Be careful, as it uses a subquery; check the query plan (EXPLAIN output) carefully.

SELECT t1.descID, PROG.galaxyId as progID, t1.MAXprogStellarMass 
FROM Guo2010a..mMR PROG, Guo2010a..mMR DES
INNER JOIN 
(SELECT DES.galaxyId as descID,
   max(PROG.stellarMass) as MAXprogStellarMass
FROM Guo2010a..mMR PROG, Guo2010a..mMR DES
WHERE DES.galaxyId in (0,2,5) 
   AND PROG.galaxyId BETWEEN DES.galaxyId AND DES.lastprogenitorId
   AND PROG.snapnum = 48 
GROUP BY DES.galaxyId ) as t1
ON (t1.descID = DES.galaxyId)
WHERE PROG.stellarMass = t1.MAXprogStellarMass
   -- repeat the progenitor filters so that only this descendant's
   -- progenitors can match, not any galaxy with the same mass
   AND PROG.galaxyId BETWEEN DES.galaxyId AND DES.lastprogenitorId
   AND PROG.snapnum = 48
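The pattern in this answer (aggregate with GROUP BY, then join back to find the row that holds the maximum) can be sketched on toy data as well. This is an illustration under assumptions, not the answerer's tested code: it runs on SQLite via Python's sqlite3 module, the table is plain mMR, the rows and descendant IDs (0, 10, 20) are hypothetical, and the join is written with explicit JOIN clauses. The outer join repeats the progenitor-range and snapshot filters; galaxy 30 in the toy data has the same mass as the winning progenitor of descendant 0 and would otherwise be matched too.

```python
import sqlite3

# Hypothetical toy rows: (galaxyId, lastprogenitorId, snapnum, stellarMass).
TOY_ROWS = [
    (0, 3, 63, 30.0),       # descendant 0, progenitor IDs span 0..3
    (1, 1, 48, 8.3345),
    (2, 2, 48, 18.3345),
    (10, 11, 63, 90.0),     # descendant 10
    (11, 11, 48, 80.3345),
    (20, 22, 63, 12.0),     # descendant 20
    (21, 21, 48, 6.3345),
    (22, 22, 48, 3.3345),
    (30, 30, 48, 18.3345),  # unrelated galaxy with the same mass as galaxy 2
]

JOIN_BACK_QUERY = """
SELECT t1.descID, PROG.galaxyId AS progID, t1.MAXprogStellarMass
FROM mMR DES
JOIN (
    -- step 1: maximum progenitor mass per descendant
    SELECT D.galaxyId AS descID, MAX(P.stellarMass) AS MAXprogStellarMass
    FROM mMR P, mMR D
    WHERE D.galaxyId IN (0, 10, 20)
      AND P.galaxyId BETWEEN D.galaxyId AND D.lastprogenitorId
      AND P.snapnum = 48
    GROUP BY D.galaxyId
) AS t1 ON t1.descID = DES.galaxyId
-- step 2: find the progenitor row that carries that maximum
JOIN mMR PROG
  ON PROG.galaxyId BETWEEN DES.galaxyId AND DES.lastprogenitorId
 AND PROG.snapnum = 48
 AND PROG.stellarMass = t1.MAXprogStellarMass
ORDER BY t1.descID
"""


def make_toy_db():
    """Create an in-memory database holding the toy mMR table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mMR (galaxyId INTEGER, lastprogenitorId INTEGER, "
        "snapnum INTEGER, stellarMass REAL)"
    )
    conn.executemany("INSERT INTO mMR VALUES (?, ?, ?, ?)", TOY_ROWS)
    return conn


def max_mass_by_join(conn):
    """Return (descID, progID, MAXprogStellarMass) rows via GROUP BY + join."""
    return conn.execute(JOIN_BACK_QUERY).fetchall()


if __name__ == "__main__":
    for row in max_mass_by_join(make_toy_db()):
        print(row)
```

As with RANK(), two progenitors of the same descendant that share the maximum mass would both be returned by this approach.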

Tip: there is a way to "force" a subquery to always run before the main query. It is done by wrapping the subquery in an extra select * from ( ).

select a,b from (select * from (select a,b from table1 where requirement = 'matched') as t1) as t2 where a > b;
