简体   繁体   中英

select certain records in oracle with latest date

So I currently have one way of doing what I want it goes like:

SELECT * FROM 
( SELECT  
e.*,
ROW_NUMBER() OVER (partition by USER_ID order by COPIED_TIMESTAMP DESC) r
FROM  
TABLE  e)
WHERE r = 1;

This works fine, but if the table becomes big I feel there may be performance issues. I was looking at something like this:

SELECT MAX(COPIED_TIMESTAMP), USER_ID
              FROM TABLE
              GROUP BY USER_ID

This also works, but is there a way to show all the information not just the timestamp and id. I'm quite new to SQL/Oracle queries and am a bit lost on this one.

but if the table becomes big I feel there may be performance issues.

I think the analytic query is just fine since it will scan the table only once . You could further optimize it's performance by creating indexes if required .

Your second query will return only the timestamp and user_id column, however, when you need the other columns, you would anyway do more than one table scan . So, it won't be a good idea.

If you are looking for other options like NOT EXISTS , it will still need more than one table scan.

Let's see a small test case:

Analytic query

SQL> EXPLAIN PLAN FOR SELECT * FROM
  2  (SELECT e.*,
  3    ROW_NUMBER() OVER (partition BY deptno order by hiredate DESC) r
  4  FROM emp e
  5  ) WHERE r = 1;

Explained.

SQL>
SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
Plan hash value: 3291446077

---------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |    14 |  1400 |     3   (0)| 00:00:01 |
|*  1 |  VIEW                    |      |    14 |  1400 |     3   (0)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|      |    14 |   518 |     3   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL     | EMP  |    14 |   518 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------


PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("R"=1)
   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "DEPTNO" ORDER BY
              INTERNAL_FUNCTION("HIREDATE") DESC )<=1)

17 rows selected.

SQL>

NOT EXISTS query

SQL> EXPLAIN PLAN FOR
  2  SELECT *
  3  FROM emp t1
  4  WHERE NOT EXISTS (SELECT 1 FROM emp t2
  5                    WHERE t1.deptno = t2.deptno
  6                      and t2.hiredate > t1.hiredate);

Explained.

SQL>
SQL> SELECT * FROM TABLE(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
Plan hash value: 3353202012

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    48 |     6   (0)| 00:00:01 |
|*  1 |  HASH JOIN ANTI    |      |     1 |    48 |     6   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| EMP  |    14 |   518 |     3   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| EMP  |    14 |   154 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------


PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."DEPTNO"="T2"."DEPTNO")
       filter("T2"."HIREDATE">"T1"."HIREDATE")

16 rows selected.

SQL>

So, you can see that the analytic query does only one table scan .

Return rows where not exists same id with later timestamp:

SELECT USER_ID, COPIED_TIMESTAMP, other_column1, other_column2
FROM TABLE t1
WHERE NOT EXISTS (select 1 from table t2
                  where t1.user_id = t2.user_id
                    and t2.COPIED_TIMESTAMP > t1.COPIED_TIMESTAMP)

Analytic functions are your friend:

SELECT MAX( COPIED_TIMESTAMP ) KEEP ( DENSE_RANK LAST ORDER BY COPIED_TIMESTAMP ) AS COPIED_TIMESTAMP,
       MAX( other_column     ) KEEP ( DENSE_RANK LAST ORDER BY COPIED_TIMESTAMP ) AS other_column,
       MAX( other_column2    ) KEEP ( DENSE_RANK LAST ORDER BY COPIED_TIMESTAMP ) AS other_column2,
       USER_ID
FROM   TABLE_NAME
GROUP BY USER_ID

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM