
Using join in R

The database I am given is shown below:

> dbReadTable(jamesdb, "EMPLOYEE")
  EMP_NO NI_NO  NAME AGE DEPT_NO
1     E1   123 SMITH  21      D1
2     E2   159 SMITH  31      D1
3     E3  5432 BROWN  65      D2
4     E5  7654 GREEN  52      D3

> dbReadTable(jamesdb, "DEPARTMENT")
  DEPT_NO     NAME MANAGER
1      D1 Accounts      E1
2      D2   Stores      E3
3      D3    Sales      E5

> dbReadTable(jamesdb, "PRODUCT")
  PROD_NO   NAME COLOR
1      p1  PANTS  BLUE
2      p2  PANTS KHAKI
3      p3  SOCKS GREEN
4      p4  SOCKS WHITE
5      p5 SHIRTS WHITE

> dbReadTable(jamesdb, "STOCK_TOTAL")
  PROD_NO QUANTITY
1      p1     2000
2      p2     1000
3      p3     1500
4      p4      200
5      p5      800

Below is what I have so far, but I think I am misunderstanding how to use join. How should I fix these queries?

  1. Retrieve the employee number of the sales department manager.

     dbGetQuery(jamesdb, 'SELECT EMPLOYEE.EMP_NO FROM DEPARTMENT JOIN EMPLOYEE WHERE DEPARTMENT.NAME = "Sales"')

  2. Who works in Department D2?

     dbGetQuery(jamesdb, 'SELECT MANAGER FROM DEPARTMENT WHERE DEPT_NO = "D2"')

  3. How many white-colored products are in stock?

     dbGetQuery(jamesdb, 'SELECT SUM(QUANTITY) FROM PRODUCT JOIN STOCK_TOTAL WHERE PRODUCT.COLOR = "WHITE"')

Joins are typically done by matching one or more fields from one table with the corresponding fields of another table. For instance, I'm inferring that DEPARTMENT.MANAGER is actually a foreign key to EMPLOYEE.EMP_NO, so when you join, you should be very specific about that relationship:

SELECT e.EMP_NO
FROM DEPARTMENT d
  LEFT JOIN EMPLOYEE e ON d.MANAGER = e.EMP_NO
WHERE d.NAME = 'Sales'
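To see why the ON clause matters, here is a minimal sketch that rebuilds the two tables and compares a join without ON against the explicit one. It assumes an in-memory SQLite database through the RSQLite package as a stand-in for jamesdb (the question does not say which backend is used); the connection name con is made up for the example. SQLite accepts JOIN without ON and treats it as a cross join, while some other databases reject it outright.

```r
library(DBI)

# Assumption: in-memory SQLite (via RSQLite) stands in for jamesdb.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

dbWriteTable(con, "EMPLOYEE", data.frame(
  EMP_NO  = c("E1", "E2", "E3", "E5"),
  NI_NO   = c(123, 159, 5432, 7654),
  NAME    = c("SMITH", "SMITH", "BROWN", "GREEN"),
  AGE     = c(21, 31, 65, 52),
  DEPT_NO = c("D1", "D1", "D2", "D3"),
  stringsAsFactors = FALSE
))
dbWriteTable(con, "DEPARTMENT", data.frame(
  DEPT_NO = c("D1", "D2", "D3"),
  NAME    = c("Accounts", "Stores", "Sales"),
  MANAGER = c("E1", "E3", "E5"),
  stringsAsFactors = FALSE
))

# No ON clause: the Sales row is paired with every EMPLOYEE row.
no_on <- dbGetQuery(con, "
  SELECT e.EMP_NO
  FROM DEPARTMENT d
    JOIN EMPLOYEE e
  WHERE d.NAME = 'Sales'")

# With ON: only the employee whose EMP_NO equals MANAGER is kept.
with_on <- dbGetQuery(con, "
  SELECT e.EMP_NO
  FROM DEPARTMENT d
    LEFT JOIN EMPLOYEE e ON d.MANAGER = e.EMP_NO
  WHERE d.NAME = 'Sales'")

nrow(no_on)     # 4 rows, one per employee
with_on$EMP_NO  # "E5"

dbDisconnect(con)
```

The first query is the shape used in the question: it returns every employee number, not just the manager's.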

Notes:

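  • Many databases allow you to be sloppy, where they will infer field associations (foreign keys) based on common field names. First, I don't like allowing that inference; second, it doesn't work here: the column names that DEPARTMENT and EMPLOYEE share (DEPT_NO and NAME) are not the relationship you want, which is MANAGER to EMP_NO.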

  • I personally prefer to be explicit about the type of join, whether left join, inner join, etc. That is a matter of style; you may write just join if you prefer.

  • I'm introducing table aliases here (d and e), a way to shorten long table names. They are stylistic, not required.

  • I personally dislike databases that lack a dictionary of foreign keys and have unintuitive names to associate them. For instance, I'm inferring from the contents of the tables that DEPARTMENT.MANAGER is linked to EMPLOYEE.EMP_NO. If I'm wrong on this inference, then the answers below are likely skewed.
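As an illustration of the first note, the sketch below reads "inferring associations from common field names" as SQL's NATURAL JOIN; that reading is my assumption. It again uses an in-memory SQLite database through RSQLite as a hypothetical stand-in for jamesdb. Because NATURAL JOIN matches on every shared column name (here both DEPT_NO and NAME), and no department name equals an employee name, it returns nothing, whereas the explicit condition finds each manager.

```r
library(DBI)

# Assumption: in-memory SQLite (via RSQLite) stands in for jamesdb.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

dbWriteTable(con, "EMPLOYEE", data.frame(
  EMP_NO  = c("E1", "E2", "E3", "E5"),
  NAME    = c("SMITH", "SMITH", "BROWN", "GREEN"),
  DEPT_NO = c("D1", "D1", "D2", "D3"),
  stringsAsFactors = FALSE
))
dbWriteTable(con, "DEPARTMENT", data.frame(
  DEPT_NO = c("D1", "D2", "D3"),
  NAME    = c("Accounts", "Stores", "Sales"),
  MANAGER = c("E1", "E3", "E5"),
  stringsAsFactors = FALSE
))

# Inferred association: matches on DEPT_NO AND NAME, which never agree.
natural <- dbGetQuery(con, "
  SELECT *
  FROM DEPARTMENT NATURAL JOIN EMPLOYEE")

# Explicit association: the relationship is spelled out in ON.
explicit <- dbGetQuery(con, "
  SELECT d.NAME AS DEPT_NAME, e.EMP_NO, e.NAME AS MANAGER_NAME
  FROM DEPARTMENT d
    INNER JOIN EMPLOYEE e ON d.MANAGER = e.EMP_NO
  ORDER BY d.DEPT_NO")

nrow(natural)    # 0
explicit$EMP_NO  # "E1" "E3" "E5"

dbDisconnect(con)
```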

For your first question, though, I don't know why you need a join: since MANAGER is already the employee number, this should be simply

select d.MANAGER
from DEPARTMENT d
where d.NAME='Sales'

Similarly, your second question needs no join.

select e.*
from EMPLOYEE e
where e.DEPT_NO='D2'
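The two join-free queries can be checked against the sample data with a short sketch like the one below. As before, the in-memory SQLite connection (RSQLite) and the name con are assumptions standing in for jamesdb. Note that the attempt in the question selects MANAGER from DEPARTMENT, which names only the manager; with this data D2 happens to have a single employee, so both approaches point at E3, but they would differ for D1.

```r
library(DBI)

# Assumption: in-memory SQLite (via RSQLite) stands in for jamesdb.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

dbWriteTable(con, "EMPLOYEE", data.frame(
  EMP_NO  = c("E1", "E2", "E3", "E5"),
  NI_NO   = c(123, 159, 5432, 7654),
  NAME    = c("SMITH", "SMITH", "BROWN", "GREEN"),
  AGE     = c(21, 31, 65, 52),
  DEPT_NO = c("D1", "D1", "D2", "D3"),
  stringsAsFactors = FALSE
))
dbWriteTable(con, "DEPARTMENT", data.frame(
  DEPT_NO = c("D1", "D2", "D3"),
  NAME    = c("Accounts", "Stores", "Sales"),
  MANAGER = c("E1", "E3", "E5"),
  stringsAsFactors = FALSE
))

# Question 1: MANAGER already holds the employee number.
sales_manager <- dbGetQuery(con, "
  SELECT d.MANAGER
  FROM DEPARTMENT d
  WHERE d.NAME = 'Sales'")

# Question 2: everyone whose DEPT_NO is D2.
d2_staff <- dbGetQuery(con, "
  SELECT e.*
  FROM EMPLOYEE e
  WHERE e.DEPT_NO = 'D2'")

# For contrast: all employees in D1, where staff and manager differ.
d1_staff <- dbGetQuery(con, "
  SELECT e.EMP_NO
  FROM EMPLOYEE e
  WHERE e.DEPT_NO = 'D1'
  ORDER BY e.EMP_NO")

sales_manager$MANAGER  # "E5"
d2_staff$NAME          # "BROWN"
d1_staff$EMP_NO        # "E1" "E2"

dbDisconnect(con)
```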

The last one needs a join, and can be done in a number of ways. One such way, which counts the white products that have a positive quantity in stock, is:

select sum(case when st.Quantity > 0 then 1 else 0 end) as Count
from STOCK_TOTAL st
  left join PRODUCT pr on st.PROD_NO=pr.PROD_NO
where pr.COLOR='WHITE'
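"How many white-colored products are in stock" can be read either as a number of distinct products or as a number of units, so the sketch below computes both, alongside the join without ON from the question. It assumes an in-memory SQLite database through RSQLite as a stand-in for jamesdb; the column aliases N_PRODUCTS, UNITS, and TOTAL are names chosen for the example.

```r
library(DBI)

# Assumption: in-memory SQLite (via RSQLite) stands in for jamesdb.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

dbWriteTable(con, "PRODUCT", data.frame(
  PROD_NO = c("p1", "p2", "p3", "p4", "p5"),
  NAME    = c("PANTS", "PANTS", "SOCKS", "SOCKS", "SHIRTS"),
  COLOR   = c("BLUE", "KHAKI", "GREEN", "WHITE", "WHITE"),
  stringsAsFactors = FALSE
))
dbWriteTable(con, "STOCK_TOTAL", data.frame(
  PROD_NO  = c("p1", "p2", "p3", "p4", "p5"),
  QUANTITY = c(2000, 1000, 1500, 200, 800),
  stringsAsFactors = FALSE
))

# Reading 1: how many white products have stock on hand.
n_products <- dbGetQuery(con, "
  SELECT SUM(CASE WHEN st.QUANTITY > 0 THEN 1 ELSE 0 END) AS N_PRODUCTS
  FROM STOCK_TOTAL st
    LEFT JOIN PRODUCT pr ON st.PROD_NO = pr.PROD_NO
  WHERE pr.COLOR = 'WHITE'")

# Reading 2: how many units of white products are in stock.
units <- dbGetQuery(con, "
  SELECT SUM(st.QUANTITY) AS UNITS
  FROM STOCK_TOTAL st
    INNER JOIN PRODUCT pr ON st.PROD_NO = pr.PROD_NO
  WHERE pr.COLOR = 'WHITE'")

# The question's shape, without ON: each white product is paired with
# every stock row, so the total is inflated.
no_on <- dbGetQuery(con, "
  SELECT SUM(QUANTITY) AS TOTAL
  FROM PRODUCT JOIN STOCK_TOTAL
  WHERE PRODUCT.COLOR = 'WHITE'")

n_products$N_PRODUCTS  # 2
units$UNITS            # 1000
no_on$TOTAL            # 11000

dbDisconnect(con)
```

The inflated 11000 is the two white products multiplied by the 5500 units across all five stock rows, which is what a join with no matching condition produces.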
