I have the following query, which joins a table to the most recent record for a given EMPLOYEE_ID,
and I am wondering whether there is a more efficient/faster way of retrieving the most recent record. What would be the best way?
SELECT * FROM EMPLOYEE
WHERE NOT EXISTS (
    SELECT 1
    FROM EMPLOYEE D
    JOIN EMPLOYEE_HISTORY E
      ON E.EMPLOYEE_ID = D.EMPLOYEE_ID
     AND E.CREATE_DATE IN (SELECT MAX(CREATE_DATE)
                           FROM EMPLOYEE_HISTORY
                           WHERE EMPLOYEE_ID = D.EMPLOYEE_ID)
)
When I compared its explain plan with that of the following query, the query below appears to be MORE costly.
SELECT *
FROM EMPLOYEE
WHERE NOT EXISTS
    (SELECT 1
     FROM EMPLOYEE D
     JOIN (
         SELECT E.*
         FROM EMPLOYEE_HISTORY E
         INNER JOIN (
             SELECT EMPLOYEE_ID
                  , MAX(CREATE_DATE) max_date
             FROM EMPLOYEE_HISTORY E2
             GROUP BY EMPLOYEE_ID
         ) EE
           ON EE.EMPLOYEE_ID = E.EMPLOYEE_ID
          AND EE.max_date = E.CREATE_DATE
     ) A
       ON A.EMPLOYEE_ID = D.EMPLOYEE_ID
      AND ROWNUM = 1)
Does that mean the first query is indeed the better one?
There is no index on CREATE_DATE alone; however, the PK is on (EMPLOYEE_ID, CREATE_DATE).
I would write the query using = rather than IN:
SELECT 1
FROM EMPLOYEE E JOIN
     EMPLOYEE_HISTORY EH
     ON EH.EMPLOYEE_ID = E.EMPLOYEE_ID AND
        EH.CREATE_DATE = (SELECT MAX(EH2.CREATE_DATE)
                          FROM EMPLOYEE_HISTORY EH2
                          WHERE EH2.EMPLOYEE_ID = EH.EMPLOYEE_ID
                         );
IN is more general than = for the comparison.
Your primary key index should be used for the subquery, which should make it pretty fast.
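As a rough illustration of what the correlated-subquery form returns, here is a minimal sketch that runs it from Python against an in-memory SQLite database. SQLite is only a stand-in for Oracle so that the example is runnable; the sample rows and the NOTE column are made up for illustration, and nothing here says anything about the plan Oracle would choose.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; data and the NOTE column are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (EMPLOYEE_ID INTEGER PRIMARY KEY);
    CREATE TABLE EMPLOYEE_HISTORY (
        EMPLOYEE_ID INTEGER NOT NULL,
        CREATE_DATE TEXT    NOT NULL,
        NOTE        TEXT,
        PRIMARY KEY (EMPLOYEE_ID, CREATE_DATE)  -- same composite PK as in the question
    );
    INSERT INTO EMPLOYEE VALUES (1), (2), (3);
    INSERT INTO EMPLOYEE_HISTORY VALUES
        (1, '2020-01-01', 'old'),
        (1, '2021-06-15', 'latest for 1'),
        (2, '2019-03-10', 'latest for 2');
""")

# Join each employee to its history row with the maximum CREATE_DATE.
latest_rows = conn.execute("""
    SELECT E.EMPLOYEE_ID, EH.CREATE_DATE, EH.NOTE
    FROM EMPLOYEE E
    JOIN EMPLOYEE_HISTORY EH
      ON EH.EMPLOYEE_ID = E.EMPLOYEE_ID
     AND EH.CREATE_DATE = (SELECT MAX(EH2.CREATE_DATE)
                           FROM EMPLOYEE_HISTORY EH2
                           WHERE EH2.EMPLOYEE_ID = EH.EMPLOYEE_ID)
    ORDER BY E.EMPLOYEE_ID
""").fetchall()

print(latest_rows)
```

Employee 3 has no history rows in this sample, so the inner join leaves it out of the result.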
Assuming that you actually do want to return actual columns, then I'm not sure if there is a way to make this faster.
If you really are selecting only 1, then forget the most recent record and just use EXISTS:
SELECT 1
FROM EMPLOYEE E
WHERE EXISTS (SELECT 1
              FROM EMPLOYEE_HISTORY EH2
              WHERE EH2.EMPLOYEE_ID = E.EMPLOYEE_ID
             );
The only additional condition your query checks for is that CREATE_DATE is not NULL, but I'm guessing that is always true anyway.
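The equivalence claimed here can be checked on a small sample. The sketch below, again using an in-memory SQLite database as a stand-in for Oracle with made-up rows, selects EMPLOYEE_ID instead of the literal 1 so that the two result sets can be compared. CREATE_DATE is declared NOT NULL because it is part of the primary key described in the question.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (EMPLOYEE_ID INTEGER PRIMARY KEY);
    CREATE TABLE EMPLOYEE_HISTORY (
        EMPLOYEE_ID INTEGER NOT NULL,
        CREATE_DATE TEXT    NOT NULL,
        PRIMARY KEY (EMPLOYEE_ID, CREATE_DATE)
    );
    INSERT INTO EMPLOYEE VALUES (1), (2), (3);
    INSERT INTO EMPLOYEE_HISTORY VALUES
        (1, '2020-01-01'), (1, '2021-06-15'), (2, '2019-03-10');
""")

# Employees found by joining to their most recent history row.
via_latest = conn.execute("""
    SELECT E.EMPLOYEE_ID
    FROM EMPLOYEE E
    JOIN EMPLOYEE_HISTORY EH
      ON EH.EMPLOYEE_ID = E.EMPLOYEE_ID
     AND EH.CREATE_DATE = (SELECT MAX(EH2.CREATE_DATE)
                           FROM EMPLOYEE_HISTORY EH2
                           WHERE EH2.EMPLOYEE_ID = EH.EMPLOYEE_ID)
    ORDER BY E.EMPLOYEE_ID
""").fetchall()

# Employees that simply have at least one history row.
via_exists = conn.execute("""
    SELECT E.EMPLOYEE_ID
    FROM EMPLOYEE E
    WHERE EXISTS (SELECT 1
                  FROM EMPLOYEE_HISTORY EH2
                  WHERE EH2.EMPLOYEE_ID = E.EMPLOYEE_ID)
    ORDER BY E.EMPLOYEE_ID
""").fetchall()

print(via_latest, via_exists)
```

On this sample both queries identify the same employees, which is the point: when no history columns are returned, finding the latest row is extra work.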
Use the RANK (or DENSE_RANK or ROW_NUMBER) analytic function:
SELECT 1
FROM EMPLOYEE E
JOIN (
    SELECT *
    FROM (
        SELECT H.*,
               RANK() OVER ( PARTITION BY EMPLOYEE_ID ORDER BY CREATE_DATE DESC ) AS rnk
        FROM EMPLOYEE_HISTORY H
    )
    WHERE rnk = 1
) H
  ON H.EMPLOYEE_ID = E.EMPLOYEE_ID
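A small sketch of the analytic-function approach, run from Python against an in-memory SQLite database (a stand-in for Oracle; it needs SQLite 3.25 or later for window functions, and the sample rows are made up). Because the primary key on (EMPLOYEE_ID, CREATE_DATE) rules out ties within an employee, the three functions should pick the same rows here; they would only differ if two history rows could share a CREATE_DATE.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for Oracle; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (EMPLOYEE_ID INTEGER PRIMARY KEY);
    CREATE TABLE EMPLOYEE_HISTORY (
        EMPLOYEE_ID INTEGER NOT NULL,
        CREATE_DATE TEXT    NOT NULL,
        PRIMARY KEY (EMPLOYEE_ID, CREATE_DATE)
    );
    INSERT INTO EMPLOYEE VALUES (1), (2), (3);
    INSERT INTO EMPLOYEE_HISTORY VALUES
        (1, '2020-01-01'), (1, '2021-06-15'), (2, '2019-03-10');
""")

def latest_per_employee(func):
    """Return (EMPLOYEE_ID, CREATE_DATE) of the top-ranked history row per employee.

    func is taken from the fixed tuple below, never from user input.
    """
    sql = f"""
        SELECT E.EMPLOYEE_ID, H.CREATE_DATE
        FROM EMPLOYEE E
        JOIN (
            SELECT *
            FROM (
                SELECT H.*,
                       {func}() OVER (PARTITION BY EMPLOYEE_ID
                                      ORDER BY CREATE_DATE DESC) AS rnk
                FROM EMPLOYEE_HISTORY H
            )
            WHERE rnk = 1
        ) H
          ON H.EMPLOYEE_ID = E.EMPLOYEE_ID
        ORDER BY E.EMPLOYEE_ID
    """
    return conn.execute(sql).fetchall()

results = {f: latest_per_employee(f) for f in ("RANK", "DENSE_RANK", "ROW_NUMBER")}
print(results)
```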
Is the requirement that the CREATE_DATE of the EMPLOYEE row must be later than the maximum CREATE_DATE for that EMPLOYEE_ID in EMPLOYEE_HISTORY?
If so, then for that EMPLOYEE_ID no row with an equal or later CREATE_DATE exists in EMPLOYEE_HISTORY:
SELECT *
FROM EMPLOYEE Emp
WHERE NOT EXISTS (
    SELECT 1
    FROM EMPLOYEE_HISTORY Hist
    WHERE Hist.EMPLOYEE_ID = Emp.EMPLOYEE_ID
      AND Hist.CREATE_DATE >= Emp.CREATE_DATE
)
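To see which rows this NOT EXISTS form keeps, here is a minimal sketch using an in-memory SQLite database from Python. It assumes, as this answer does, that EMPLOYEE has its own CREATE_DATE column; SQLite stands in for Oracle, dates are stored as ISO-8601 text so that string comparison matches date order, and all rows are made up.

```python
import sqlite3

# Assumptions: SQLite stands in for Oracle, EMPLOYEE has a CREATE_DATE column,
# and the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (
        EMPLOYEE_ID INTEGER PRIMARY KEY,
        CREATE_DATE TEXT NOT NULL
    );
    CREATE TABLE EMPLOYEE_HISTORY (
        EMPLOYEE_ID INTEGER NOT NULL,
        CREATE_DATE TEXT    NOT NULL,
        PRIMARY KEY (EMPLOYEE_ID, CREATE_DATE)
    );
    INSERT INTO EMPLOYEE VALUES
        (1, '2022-01-01'),  -- later than all of its history rows
        (2, '2019-01-01'),  -- earlier than its history row
        (3, '2020-05-05');  -- has no history rows at all
    INSERT INTO EMPLOYEE_HISTORY VALUES
        (1, '2020-01-01'), (1, '2021-06-15'), (2, '2019-03-10');
""")

# Keep employees for which no history row is as recent as the employee row.
kept = conn.execute("""
    SELECT Emp.EMPLOYEE_ID
    FROM EMPLOYEE Emp
    WHERE NOT EXISTS (
        SELECT 1
        FROM EMPLOYEE_HISTORY Hist
        WHERE Hist.EMPLOYEE_ID = Emp.EMPLOYEE_ID
          AND Hist.CREATE_DATE >= Emp.CREATE_DATE
    )
    ORDER BY Emp.EMPLOYEE_ID
""").fetchall()

print(kept)
```

Note that an employee with no history rows also passes the NOT EXISTS test, which may or may not be what is wanted.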