
Using left_join based on two columns does not work in R

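I am trying to left_join a data frame containing cld (compact letter display) letters onto my main data frame. The join should be based on two columns: crop plus either till or GW. In the cld data frame, each row has a value in only one of till and GW (the other is NA), because the letters come from different models.

Here is the main data frame: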

main_df<-structure(list(crop = c("B", "B", "B", "B", "B", "B", "B", "B", 
"B", "A", "A", "A", "A", "A", "A", "A", "A", "A", "C", "C", "C", 
"C", "C", "C", "C", "C", "C"), till = c("X", "X", "X", "Y", "Y", 
"Y", "Z", "Z", "Z", "X", "X", "X", "Y", "Y", "Y", "Z", "Z", "Z", 
"X", "X", "X", "Y", "Y", "Y", "Z", "Z", "Z"), GW = c("250", "100", 
"500", "250", "100", "500", "250", "100", "500", "250", "100", 
"500", "250", "100", "500", "250", "100", "500", "250", "100", 
"500", "250", "100", "500", "250", "100", "500"), dm = c(12.492780040282, 
21.2330087520355, 9.08920058839951, 9.6579014126203, 32.3208262815535, 
10.6259628492133, 6.13043260006999, 49.6628012967183, 28.8896483162288, 
14.8279966222885, 11.5590504143496, 23.7186742486867, 22.8598403733191, 
8.59025110732551, 20.8781551231343, 34.6812252760796, 25.056901935212, 
11.9791387922734, 2.98603520945085, 20.768615091017, 5.68987327841495, 
35.6382624005007, 24.1315098558383, 32.3442728999024, 35.5586316123229, 
8.36256345081252, 6.06606303991154)), row.names = c(NA, -27L), class = c("tbl_df", 
"tbl", "data.frame"))

and here is the cld data frame:

cld_df<-structure(list(till = c("Z", "Y", "X", NA, NA, NA, "Z", "X", 
"Y"), .group = c(" a ", "  b", "  b", " a ", " ab", "  b", " a ", 
" ab", "  b"), crop = c("B", "B", "B", "C", "C", "C", "A", "A", 
"A"), GW = c(NA, NA, NA, "500", "250", "100", NA, NA, NA)), row.names = c(NA, 
-9L), class = c("tbl_df", "tbl", "data.frame"))

The cld letters are in the column named ".group". Whether I keep the key columns as character or convert them to factor, the joined .group column comes out empty (only NA):

library(dplyr)

main_df %>%
  mutate(till = as.factor(till), GW = as.factor(GW), crop = as.factor(crop)) %>%
  left_join(cld_df %>% mutate(till = as.factor(till), GW = as.factor(GW), crop = as.factor(crop)))
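
A probable cause, sketched on toy data: when by is omitted, left_join() joins on every column the two tables share (here crop, till and GW), and by default an NA key only matches another NA. Every cld row has NA in either till or GW, so no row of the main table can match on all three columns, whatever the column type. The tibbles main_small and cld_small below are made-up stand-ins, not the real data:

```r
library(dplyr)

# Hypothetical miniature versions of main_df and cld_df
main_small <- tibble(
  crop = c("B", "C"),
  till = c("X", "X"),
  GW   = c("250", "250")
)
cld_small <- tibble(
  till   = c("X", NA),
  .group = c("  b", " ab"),
  crop   = c("B", "C"),
  GW     = c(NA, "250")
)

# Same keys a join without `by` would pick: all shared columns
natural <- left_join(main_small, cld_small, by = c("crop", "till", "GW"))
natural$.group   # NA in both rows: the NA keys never match "X" or "250"

# Joining only on the keys that are filled in finds the letter for crop B
by_till <- left_join(
  main_small,
  select(cld_small, crop, till, .group),
  by = c("crop", "till")
)
by_till$.group   # "  b" for crop B, still NA for crop C (its key is GW)
```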

Is this what you need?

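library(dplyr)

main_df %>%
  # letters keyed by crop + till (crops A and B in cld_df)
  left_join(select(cld_df, crop, till, .group), by = c("crop", "till")) %>%
  # letters keyed by crop + GW (crop C in cld_df)
  left_join(select(cld_df, crop, GW, .group), by = c("crop", "GW")) %>%
  # keep whichever of the two lookups found a letter
  mutate(.group = coalesce(.group.x, .group.y)) %>%
  select(-.group.x, -.group.y) %>%
  print(n = 99)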
# # A tibble: 27 x 5
#    crop  till  GW       dm .group
#    <chr> <chr> <chr> <dbl> <chr> 
#  1 B     X     250   12.5  "  b" 
#  2 B     X     100   21.2  "  b" 
#  3 B     X     500    9.09 "  b" 
#  4 B     Y     250    9.66 "  b" 
#  5 B     Y     100   32.3  "  b" 
#  6 B     Y     500   10.6  "  b" 
#  7 B     Z     250    6.13 " a " 
#  8 B     Z     100   49.7  " a " 
#  9 B     Z     500   28.9  " a " 
# 10 A     X     250   14.8  " ab" 
# 11 A     X     100   11.6  " ab" 
# 12 A     X     500   23.7  " ab" 
# 13 A     Y     250   22.9  "  b" 
# 14 A     Y     100    8.59 "  b" 
# 15 A     Y     500   20.9  "  b" 
# 16 A     Z     250   34.7  " a " 
# 17 A     Z     100   25.1  " a " 
# 18 A     Z     500   12.0  " a " 
# 19 C     X     250    2.99 " ab" 
# 20 C     X     100   20.8  "  b" 
# 21 C     X     500    5.69 " a " 
# 22 C     Y     250   35.6  " ab" 
# 23 C     Y     100   24.1  "  b" 
# 24 C     Y     500   32.3  " a " 
# 25 C     Z     250   35.6  " ab" 
# 26 C     Z     100    8.36 "  b" 
# 27 C     Z     500    6.07 " a " 
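
This works because each lookup uses only the key that is actually filled in for a given crop, and coalesce() then keeps whichever letter was found. A slightly more defensive variant, sketched on made-up toy data (toy_main, toy_cld and group_gw are hypothetical names, not from the question), drops the NA-keyed lookup rows first and checks that no rows were duplicated or left without a letter:

```r
library(dplyr)

# Hypothetical toy data with the same shape as main_df and cld_df
toy_main <- tibble(
  crop = c("B", "B", "C", "C"),
  till = c("X", "Z", "X", "Z"),
  GW   = c("250", "100", "250", "100")
)
toy_cld <- tibble(
  till   = c("X", "Z", NA, NA),
  .group = c("  b", " a ", " ab", "  b"),
  crop   = c("B", "B", "C", "C"),
  GW     = c(NA, NA, "250", "100")
)

# Keep only lookup rows whose key is present, so an NA key can never match
by_till <- toy_cld %>% filter(!is.na(till)) %>% select(crop, till, .group)
by_gw   <- toy_cld %>% filter(!is.na(GW)) %>% select(crop, GW, group_gw = .group)

joined <- toy_main %>%
  left_join(by_till, by = c("crop", "till")) %>%
  left_join(by_gw, by = c("crop", "GW")) %>%
  mutate(.group = coalesce(.group, group_gw)) %>%
  select(-group_gw)

# Sanity checks: same number of rows as before, every row got a letter
stopifnot(nrow(joined) == nrow(toy_main), !anyNA(joined$.group))
```

Renaming the second lookup column avoids the .x/.y suffixes, and the filters matter only if the main table could itself contain NA in till or GW.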
