Remove Duplicates From JOIN

I'm currently learning databases in school, but unfortunately our teacher doesn't really enjoy helping or answering questions at all. I'm working on a couple of Oracle DB exercises and I've come across one question I really don't know how to solve.

Table 1: Students

   ID      FAMILY NAME     FIRST NAME    BIRTH DATE      IM_DATE    FACULTY

  4711    Lehmann          Heini         13.03.89       01.09.08     I         
  4712    Huber            Sven          14.07.89       01.09.08     IWI       
  4713    Meier            Swantje       11.04.88       01.03.09     IWI       
  4714    Tunix            Ole           15.03.88       01.03.09     IWI       
  4715    Kannix           Peter         02.11.89       01.03.09     IWI       
  4716    Weissnix         Axel          15.12.88       01.03.09     IWI     

Table 2: LN

   ID    FKBEZ         VNR P_DATE        GRADE

  4711   DB1            1 02.02.08        4,7 
  4711   DB1            2 07.07.09          5 
  4711   PR1            1 28.01.09          3 
  4712   DB1            1 02.02.08        3,7 
  4713   DB1            1 02.02.08        1,7 
  4713   DB2            1 02.02.09        3,7 
  4714   PR1            1 28.01.09          2 
  4715   DB1            1 02.02.08          5 
  4711   DB2            1 14.07.09        1,3 
  4711   PR2            1 30.06.09        2,3 

Now, here are the questions.

Q1: Create an SQL query (JOIN) which will result in duplicate rows.

Q2: Change your query from Q1 so that it won't show any duplicates.

My first 'problem' here is that I'm not 100% sure what the definition of duplicate is. Are duplicates rows with 100% identical content on ALL columns even if you haven't selected them in your SELECT command?

Example: Say I've created a query and I've chosen the columns 'Family Name' and 'Age' in my SELECT command and my result looks like this:

Family Name          Age

Miller               20
Miller               20

but these are actually two different people and have a different first name. Do these qualify as duplicates, since I haven't selected the first name and therefore it isn't showing, or doesn't it matter what I choose via SELECT and duplicate rows only qualify as duplicates if they're 100% identical?
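To make the Miller example concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Oracle. The People table and the first names Anna and Ben are hypothetical and exist only for illustration. It shows that a query result consists only of the selected columns, so two rows that differ in an unselected column come back as identical result rows.

```python
import sqlite3

# In-memory SQLite database; a stand-in for Oracle, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Family_Name TEXT, First_Name TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO People VALUES (?, ?, ?)",
    [("Miller", "Anna", 20), ("Miller", "Ben", 20)],  # hypothetical first names
)

# Projection onto two columns: the differing First_Name is not part of the result.
rows = conn.execute("SELECT Family_Name, Age FROM People").fetchall()
print(rows)            # [('Miller', 20), ('Miller', 20)]
print(len(set(rows)))  # 1 -> only one distinct row among the two returned
```

Under this reading, the two Miller rows are duplicates in the result, even though the underlying table rows are different.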

Alright, back to my questions. For Q1 I've chosen a simple (INNER) JOIN query looking like this:

SELECT S.ID, S.Family_Name, S.First_Name, LN.FKBEZ
FROM Students S
JOIN LN ON S.ID = LN.ID

FAMILY NAME               FIRST NAME               FKBEZ       (GRADE)

Lehmann                   Heini                     DB1          4,7 
Lehmann                   Heini                     DB1            5 
Lehmann                   Heini                     PR1            3 
Huber                     Sven                      DB1          3,7 
Meier                     Swantje                   DB1          1,7 
Meier                     Swantje                   DB2          3,7 
Tunix                     Ole                       PR1            2 
Kannix                    Peter                     DB1            5 
Lehmann                   Heini                     DB2          1,3 
Lehmann                   Heini                     PR2          2,3 

This is the result. I haven't SELECTed 'GRADE' in my query, but I've listed it for you too, so you can understand my question a little better. Since I haven't selected 'GRADE' in my query, I'd consider rows 1 and 2 to be duplicates, because they're identical in every visible column. I then went on to Q2 and used the exact same query, only this time with a NATURAL JOIN (which I expected to eliminate all the duplicate rows), but the result was exactly the same.
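The observation that NATURAL JOIN returns the same rows can be reproduced with a small sketch, again using Python's sqlite3 module as a stand-in for Oracle and a reduced version of the two tables (only a few columns and rows, which is an assumption made for brevity). A NATURAL JOIN joins on the columns with the same name, here only ID, and merges that column into one; it does not remove duplicate rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced schema and data taken from the tables in the question.
conn.executescript("""
CREATE TABLE Students (ID INTEGER, Family_Name TEXT, First_Name TEXT);
CREATE TABLE LN (ID INTEGER, FKBEZ TEXT, VNR INTEGER);
INSERT INTO Students VALUES (4711, 'Lehmann', 'Heini'), (4712, 'Huber', 'Sven');
INSERT INTO LN VALUES (4711, 'DB1', 1), (4711, 'DB1', 2), (4712, 'DB1', 1);
""")

inner = conn.execute(
    "SELECT S.Family_Name, S.First_Name, LN.FKBEZ "
    "FROM Students S JOIN LN ON S.ID = LN.ID"
).fetchall()

# ID is the only column name shared by both tables, so it is the join column.
natural = conn.execute(
    "SELECT Family_Name, First_Name, FKBEZ FROM Students NATURAL JOIN LN"
).fetchall()

print(sorted(inner) == sorted(natural))  # True
print(sorted(natural))
# [('Huber', 'Sven', 'DB1'), ('Lehmann', 'Heini', 'DB1'), ('Lehmann', 'Heini', 'DB1')]
```

Both queries still return the row for Lehmann / Heini / DB1 twice, once per exam attempt (VNR 1 and 2).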

So now my conclusion is that rows are only considered duplicates if they're 100% identical in every visible and 'invisible' column. But now I'm completely stumped, because I have no idea how to solve Q1 and Q2.

It's important to know that we're not supposed to solve these questions using DISTINCT or GROUP BY, only (different kinds of) JOINs, INTERSECT, UNION and MINUS.
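Given that restriction, one possible approach to Q2 (an assumption, not necessarily the solution the exercise intends) relies on the fact that UNION, unlike UNION ALL, returns each distinct row only once. Combining the Q1 query with itself via UNION therefore removes the repeated rows without DISTINCT or GROUP BY. The sketch below uses Python's sqlite3 module as a stand-in for Oracle, with a reduced copy of the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced schema and data taken from the tables in the question.
conn.executescript("""
CREATE TABLE Students (ID INTEGER, Family_Name TEXT, First_Name TEXT);
CREATE TABLE LN (ID INTEGER, FKBEZ TEXT, VNR INTEGER);
INSERT INTO Students VALUES (4711, 'Lehmann', 'Heini'), (4712, 'Huber', 'Sven');
INSERT INTO LN VALUES (4711, 'DB1', 1), (4711, 'DB1', 2), (4712, 'DB1', 1);
""")

# Q1-style join: Lehmann / Heini / DB1 appears once per attempt.
query = (
    "SELECT S.Family_Name, S.First_Name, LN.FKBEZ "
    "FROM Students S JOIN LN ON S.ID = LN.ID"
)
with_dups = conn.execute(query).fetchall()

# UNION has set semantics: identical rows collapse into one.
no_dups = conn.execute(query + " UNION " + query).fetchall()

print(len(with_dups))   # 3
print(sorted(no_dups))  # [('Huber', 'Sven', 'DB1'), ('Lehmann', 'Heini', 'DB1')]
```

INTERSECT of the query with itself would collapse the rows in the same way, since it also returns distinct rows.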

I guess you can see that I've put a lot of time and effort into composing this post, so I'd appreciate it very much if you guys could help me out on this one.

Thanks.

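Try using a LEFT JOIN. A row in the first table with no corresponding record in the second table will get NULL values in the second table's columns. E.g. SELECT * FROM t1 LEFT JOIN t2 ON t1.id = t2.id (here id stands for whatever column links the two tables). Non-matching values from t2 will be NULL.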
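A minimal sketch of the NULL behaviour this answer describes, using Python's sqlite3 module as a stand-in for Oracle and two students from the question's data: 4715 has an LN row, 4716 has none. The reduced column set is an assumption made for brevity. Note that this only illustrates how LEFT JOIN fills in NULLs; it does not by itself remove duplicate rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced schema and data taken from the tables in the question.
conn.executescript("""
CREATE TABLE Students (ID INTEGER, Family_Name TEXT);
CREATE TABLE LN (ID INTEGER, FKBEZ TEXT);
INSERT INTO Students VALUES (4715, 'Kannix'), (4716, 'Weissnix');
INSERT INTO LN VALUES (4715, 'DB1');
""")

# Every student is kept; LN columns are NULL (None in Python) where no match exists.
rows = conn.execute(
    "SELECT S.ID, S.Family_Name, LN.FKBEZ "
    "FROM Students S LEFT JOIN LN ON S.ID = LN.ID "
    "ORDER BY S.ID"
).fetchall()
print(rows)  # [(4715, 'Kannix', 'DB1'), (4716, 'Weissnix', None)]
```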
