Oracle SQL max() with group by - duplicate values

Question

This is a simplified version of a query I created. It gives me what I want (a list of all stud_id with the selected cpnt_id, whether there is a value in compl_dte or not), but only if the UserInput for Item is limited to a single record.

select stud.*, lrnhist.* from
(select s.stud_id,
        i.cpnt_id
from student s, item i
where s.stud_id in [UserInput]
      and i.cpnt_id in [UserInput]
) stud
left outer join
(select lh.stud_id,
        lh.cpnt_id,
        max(lh.compl_dte) compl_dte
from learnhist lh
where lh.cpnt_id in [UserInput]
group by lh.stud_id, lh.cpnt_id
) lrnhist
on stud.stud_id = lrnhist.stud_id

When it is run with a UserInput that specifies 2 or more Items, it returns the correct rows, but the returned value of compl_dte is always identical for each value of stud_id (because of the use of max(compl_dte), I'm sure). I'm just not sure what I need to do to make sure the returned compl_dte is the max for the stud_id/cpnt_id pair, not the max for stud_id regardless of cpnt_id.

Table values:

student
stud_id
1
2
3
4

item
cpnt_id
a
b
c
d

learnhist
stud_id cpnt_id compl_dte
1    a    5/5/2017
1    a    3/3/2016
1    b    10/10/2016
2    c    8/8/2016
3    b    2/2/2017

Results where UserInput is stud_id = * and cpnt_id = a:

stud_id cpnt_id compl_dte
1    a    5/5/2017
2    a
3    a
4    a

which is correct. Results where UserInput is stud_id = * and cpnt_id = both a and b:

stud_id cpnt_id compl_dte
1    a    5/5/2017
1    b    5/5/2017
2    a
2    b
3    a    2/2/2017
3    b    2/2/2017
4    a
4    b

which is not what I'm looking for. Results I'm looking for in that case:

stud_id cpnt_id compl_dte
1    a    5/5/2017
1    b    10/10/2016
2    a
2    b
3    a
3    b    2/2/2017
4    a
4    b

First post here, hopefully that all makes sense and I've asked in the right place!

Answer 1

I believe the problem may be a missing join predicate between the STUD and LRNHIST inline views.
In the query you provided, the STUD inline view is a cartesian product between STUDENT and ITEM, which is then outer joined to the LRNHIST view, which does have one COMPL_DTE per STUD_ID / CPNT_ID pair. But since the OUTER JOIN only has a predicate on STUD_ID, you'll also get matches where STUD.CPNT_ID <> LRNHIST.CPNT_ID, which produces extra rows.

You can break it down and look at the inline views individually.

For the STUD query:

SELECT STUDENT.STUD_ID, ITEM.CPNT_ID FROM STUDENT 
CROSS JOIN ITEM
WHERE STUDENT.STUD_ID IN (1,2,3,4)
AND ITEM.CPNT_ID IN ('a','b','c','d');

Result:

stud_id     cpnt_id
1   a
1   b
1   c
1   d
2   a
2   b
2   c
2   d
... etc

So we can expect all these rows in the final query.

If you look at LRNHIST individually:

SELECT LEARNHIST.STUD_ID,
                LEARNHIST.CPNT_ID,
                 MAX(LEARNHIST.COMPL_DTE) COMPL_DTE
                 FROM LEARNHIST
                 GROUP BY LEARNHIST.STUD_ID, LEARNHIST.CPNT_ID;

There is indeed only one row per stud_id-cpnt_id pair (that exists in learnhist):

stud_id     cpnt_id     compl_dte
1   b   October, 10 2016 00:00:00
1   a   May, 05 2017 00:00:00
3   b   February, 02 2017 00:00:00
2   c   August, 08 2016 00:00:00

Now if you join using only STUD_ID, you'll get a May 5th row where STUD has 1 - a and LRNHIST has 1 - a, but you'll also get a row where LRNHIST has 1 - b, because there is no join predicate on CPNT_ID. If you select all five columns, you can see where the duplication comes in:

SELECT STUD.*, LRNHIST.* FROM (
SELECT STUDENT.STUD_ID, ITEM.CPNT_ID FROM STUDENT 
CROSS JOIN ITEM
WHERE STUDENT.STUD_ID IN (1,2,3,4)
AND ITEM.CPNT_ID IN ('a','b','c','d')) STUD
LEFT OUTER JOIN (SELECT LEARNHIST.STUD_ID,
                LEARNHIST.CPNT_ID,
                 MAX(LEARNHIST.COMPL_DTE) COMPL_DTE
                 FROM LEARNHIST
                 GROUP BY LEARNHIST.STUD_ID, LEARNHIST.CPNT_ID
                ) LRNHIST
ON STUD.STUD_ID = LRNHIST.STUD_ID
ORDER BY 1 ASC, 2 ASC, 3 ASC, 4 ASC, 5 ASC;

Result:

s_stud  s_cpnt  l_stud  l_cpnt  l_compl

1   a   1   a   May, 05 2017 00:00:00
1   a   1   b   October, 10 2016 00:00:00
1   b   1   a   May, 05 2017 00:00:00
1   b   1   b   October, 10 2016 00:00:00
1   c   1   a   May, 05 2017 00:00:00
1   c   1   b   October, 10 2016 00:00:00
1   d   1   a   May, 05 2017 00:00:00
1   d   1   b   October, 10 2016 00:00:00
2   a   2   c   August, 08 2016 00:00:00
... etc

Because this joins only on stud_id, STUD's 1-a row is free to match both of LRNHIST's rows for student 1: the 1-a group (May) and the 1-b group (October). The same happens for STUD's 1-b, 1-c, and 1-d rows.

Now if you join on CPNT_ID as well, only the LRNHIST records that match both CPNT_ID and STUD_ID will be returned (May for 1-a and October for 1-b).

SELECT STUD.STUD_ID, STUD.CPNT_ID, LRNHIST.COMPL_DTE FROM (
SELECT STUDENT.STUD_ID, ITEM.CPNT_ID FROM STUDENT 
CROSS JOIN ITEM
WHERE STUDENT.STUD_ID IN (1,2,3,4)
AND ITEM.CPNT_ID IN ('a','b','c','d')) STUD
LEFT OUTER JOIN (SELECT LEARNHIST.STUD_ID,
                LEARNHIST.CPNT_ID,
                 MAX(LEARNHIST.COMPL_DTE) COMPL_DTE
                 FROM LEARNHIST
                 GROUP BY LEARNHIST.STUD_ID, LEARNHIST.CPNT_ID
                ) LRNHIST
ON STUD.STUD_ID = LRNHIST.STUD_ID
AND STUD.CPNT_ID = LRNHIST.CPNT_ID
ORDER BY 1 ASC, 2 ASC;

Result:

stud_id     cpnt_id     compl_dte
1   a   May, 05 2017 00:00:00
1   b   October, 10 2016 00:00:00
1   c   (null)
1   d   (null)
2   a   (null)
2   b   (null)
2   c   August, 08 2016 00:00:00
2   d   (null)
... etc

Now you should have only one row per STUD_ID / CPNT_ID pair, with nulls for compl_dte where no LRNHIST record matches.
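
If you want to check the difference between the two join conditions without an Oracle instance, here is a minimal sketch that loads the sample data from the question into an in-memory SQLite database and runs both versions of the join for cpnt_id in ('a', 'b'). The use of SQLite and Python, the ISO-formatted date strings (so that MAX compares them correctly as text), and the helper names build_db and run are my assumptions, not part of the original query.

```python
import sqlite3


def build_db():
    """Create the three sample tables from the question in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE student (stud_id INTEGER);
        CREATE TABLE item (cpnt_id TEXT);
        CREATE TABLE learnhist (stud_id INTEGER, cpnt_id TEXT, compl_dte TEXT);
        INSERT INTO student VALUES (1), (2), (3), (4);
        INSERT INTO item VALUES ('a'), ('b'), ('c'), ('d');
        -- Dates stored as ISO text (assumption) so MAX orders them correctly.
        INSERT INTO learnhist VALUES
            (1, 'a', '2017-05-05'),
            (1, 'a', '2016-03-03'),
            (1, 'b', '2016-10-10'),
            (2, 'c', '2016-08-08'),
            (3, 'b', '2017-02-02');
    """)
    return conn


# Same shape as the query in this answer; {extra} is where the second
# join predicate is added or left out.
QUERY = """
SELECT stud.stud_id, stud.cpnt_id, lrnhist.compl_dte
FROM (SELECT s.stud_id, i.cpnt_id
      FROM student s CROSS JOIN item i
      WHERE i.cpnt_id IN ('a', 'b')) stud
LEFT OUTER JOIN (SELECT lh.stud_id, lh.cpnt_id,
                        MAX(lh.compl_dte) AS compl_dte
                 FROM learnhist lh
                 GROUP BY lh.stud_id, lh.cpnt_id) lrnhist
  ON stud.stud_id = lrnhist.stud_id {extra}
ORDER BY 1, 2, 3
"""


def run(conn, join_on_cpnt):
    """Run the join on stud_id only, or on stud_id and cpnt_id."""
    extra = "AND stud.cpnt_id = lrnhist.cpnt_id" if join_on_cpnt else ""
    return conn.execute(QUERY.format(extra=extra)).fetchall()


conn = build_db()
stud_only = run(conn, join_on_cpnt=False)  # join on stud_id only
both_keys = run(conn, join_on_cpnt=True)   # join on stud_id and cpnt_id

print(len(stud_only), "rows when joining on stud_id only")
for row in both_keys:
    print(row)
```

With the stud_id-only join, student 2's rows for a and b pick up the completion date that actually belongs to item c. With both predicates there is exactly one row per stud_id / cpnt_id pair.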

Answer 2

Use a factored subquery (a WITH clause).

WITH all_ids AS (
SELECT s.stud_id as stud_id,
       i.cpnt_id as cpnt_id
  FROM student s
CROSS JOIN item i )
SELECT stud_id, cpnt_id, max(lh.compl_dte) as compl_dte
  FROM all_ids
LEFT JOIN learnhist lh USING (stud_id, cpnt_id)
 WHERE cpnt_id IN ('a', 'b')
GROUP BY stud_id, cpnt_id
ORDER BY stud_id;
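
As a rough check that this produces the result the question asks for, the sketch below runs the same statement against the question's sample data in an in-memory SQLite database. SQLite, Python, the ISO-formatted date strings, and the extra cpnt_id in the ORDER BY (added only to make the row order deterministic) are assumptions on my part; the statement itself is otherwise unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (stud_id INTEGER);
    CREATE TABLE item (cpnt_id TEXT);
    CREATE TABLE learnhist (stud_id INTEGER, cpnt_id TEXT, compl_dte TEXT);
    INSERT INTO student VALUES (1), (2), (3), (4);
    INSERT INTO item VALUES ('a'), ('b'), ('c'), ('d');
    -- Dates stored as ISO text (assumption) so MAX orders them correctly.
    INSERT INTO learnhist VALUES
        (1, 'a', '2017-05-05'),
        (1, 'a', '2016-03-03'),
        (1, 'b', '2016-10-10'),
        (2, 'c', '2016-08-08'),
        (3, 'b', '2017-02-02');
""")

# USING (stud_id, cpnt_id) joins on both keys, so each stud_id / cpnt_id
# pair only sees its own learnhist rows before MAX is applied.
cte_rows = conn.execute("""
    WITH all_ids AS (
        SELECT s.stud_id AS stud_id,
               i.cpnt_id AS cpnt_id
          FROM student s
         CROSS JOIN item i
    )
    SELECT stud_id, cpnt_id, MAX(lh.compl_dte) AS compl_dte
      FROM all_ids
      LEFT JOIN learnhist lh USING (stud_id, cpnt_id)
     WHERE cpnt_id IN ('a', 'b')
     GROUP BY stud_id, cpnt_id
     ORDER BY stud_id, cpnt_id
""").fetchall()

for row in cte_rows:
    print(row)
```

The key point is the same as in the first answer: the join uses both stud_id and cpnt_id, so the max is taken per pair rather than per student.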
