简体   繁体   中英

Error with R dplyr left_join

So I've been trying to use left_join to get the columns of a new dataset onto my main dataset (called employee)

I've double checked the vector names and the cleaning that I've don't and nothing seems to work. Here is my code. Would appreciate any help.

job_codes <- read_csv("Quest_UMMS_JobCodes.csv")
job_codes <- job_codes %>%
clean_names() %>%
select(job_code, pos_desc = pos_des_desc)

job_codes$is_nurse <- str_detect(tolower(job_codes$pos_desc), "nurse") 

employee <- employee %>%
left_join(job_codes, by = "job_code")

The error I keep getting:Error in eval(substitute(expr), envir, enclos) : 'job_code' column not found in rhs, cannot join

here are the results of

names(job_code)
> names(job_codes)
[1] "job_code" "pos_desc" "is_nurse"

names(employee)
> names(employee)
 [1] "REC_NUM"             "ZIP"                 "STATE"              
 [4] "SEX"                 "EEO_CLASS"           "BIRTH_YEAR"         
 [7] "EMP_STATUS"          "PROCESS_LEVEL"       "DEPARTMENT"         
 [10] "JOB_CODE"            "UNION_CODE"          "SUPERVISOR"         
 [13] "DATE_HIRED"          "R_SHIFT"             "SALARY_CLASS"       
 [16] "EXEMPT_EMP"          "PAY_RATE"            "ADJ_HIRE_DATE"      
 [19] "ANNIVERS_DATE"       "TERM_DATE"           "NBR_FTE"            
 [22] "PENSION_PLAN"        "PAY_GRADE"           "SCHEDULE"           
 [25] "OT_PLAN_CODE"        "DECEASED"            "POSITION"           
 [28] "WORK_SCHED"          "SUPERVISOR_IND"      "FTE_TOTAL"          
 [31] "PRO_RATE_TOTAL"      "PRO_RATE_A_SAL"      "NEW_HIRE_DATE"      
 [34] "COUNTY"              "FST_DAY_WORKED"      "date_hired"         
 [37] "date_hired_adj"      "term_date"           "employment_duration"
 [40] "current"             "age"                 "emp_duration_years" 
 [43] "DESCRIPTION.x"       "PAY_STATUS.x"        "DESCRIPTION.y"      
 [46] "PAY_STATUS.y"      

Now, after the OP has added the column names of both tables in the Q, it is evident that the columns to join on are written in different ways (upper vs lower case).

If the column names are different, help("left_join") suggests:

To join by different variables on x and y use a named vector. For example, by = c("a" = "b") will match xa to yb

So, in this case it should read

employee <- employee %>% left_join(job_codes, by = c("JOB_CODE" = "job_code"))

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM