Merging two dataframes with left_join produces NAs in 'right' columns

When I use dplyr::left_join to combine two data frames, all of the columns from the 'right' data frame are filled with NA values.

I have checked several other answers on Stack Overflow to try to find the source of my mistake, including https://stackoverflow.com/questions/35016377/dplyrleft-join-produce-na-values-for-new-joined-columns

However, the answers already available on Stack Overflow have not fixed my issue.

Here is my reproducible code:

# Libraries
library("remotes")
library("tidytuesdayR")
library("ggplot2")
library("tidyverse")

# Load data
tuesdata <- tidytuesdayR::tt_load("2021-01-19")
gender <- tuesdata$gender
crops <- tuesdata$crops
households <- tuesdata$households

# Rename the first crops column to "County"
colnames(crops)[1] <- "County"
# Make the County columns character vectors
gender$County <- as.character(gender$County)
crops$County <- as.character(crops$County)
households$County <- as.character(households$County)
# Change the "total" cell to "Kenya"
gender[1, 1] <- "Kenya"
# Convert all caps to title case
crops$County <- str_to_title(crops$County)

# left_join households and crops by County
df <- left_join(households, crops, by = c("County" = "County"))

When I run this, every single 'crops' column is filled with NAs. My overall goal is to merge all three datasets (crops, households, and gender) by county name in Kenya.

I could use some assistance. Thanks.

You need to trim the County variable in the households df: its values contain extra trailing spaces, so they do not match the County values in the crops df. For example:

"Kenya   "
"Mombasa   "

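One way to spot this kind of mismatch yourself is sketched below. It uses small made-up data frames (the names match the question, but the values are invented for illustration) and assumes dplyr and stringr are installed. anti_join lists the left-hand rows whose key has no exact match on the right, and comparing each key against its trimmed version flags hidden padding.

```r
library(dplyr)
library(stringr)

# Made-up stand-ins for the real tables; the trailing spaces mimic the problem.
households <- data.frame(County = c("Kenya   ", "Mombasa   "),
                         n_households = c(10, 2))
crops <- data.frame(County = c("Kenya", "Mombasa"),
                    tea = c(5, 1))

# Rows of households whose County has no exact match in crops.
unmatched <- anti_join(households, crops, by = "County")

# TRUE where the key changes once surrounding whitespace is stripped.
has_padding <- households$County != str_trim(households$County)

print(unmatched)
print(has_padding)
```

If unmatched contains rows that look like they should have matched, and has_padding is TRUE for them, stray whitespace is a likely culprit.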
Adding this extra line before the left_join fixes it:

# Strip leading and trailing whitespace from the join key
households$County <- stringr::str_trim(households$County)
df <- left_join(households, crops, by = "County")
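For the stated goal of merging all three tables, the same cleaning step can be applied to each County column before chaining the joins. The following is only a sketch on made-up data: clean_county is a hypothetical helper introduced here, not part of any package, and the values are invented. County names that are spelled differently between tables would still need to be reconciled by hand.

```r
library(dplyr)
library(stringr)

# Made-up stand-ins for the three tables (values are illustrative only).
households <- data.frame(County = c("Kenya   ", "Mombasa   "),
                         n_households = c(10, 2))
crops <- data.frame(County = c("KENYA", "MOMBASA"),
                    tea = c(5, 1))
gender <- data.frame(County = c("Kenya", "Mombasa"),
                     total = c(30, 6))

# Hypothetical helper: trim whitespace, then normalise case of the join key.
clean_county <- function(df) {
  mutate(df, County = str_to_title(str_trim(County)))
}

# Join crops and gender onto households, one table at a time.
combined <- clean_county(households) %>%
  left_join(clean_county(crops), by = "County") %>%
  left_join(clean_county(gender), by = "County")

print(combined)
```

Normalising every key with the same function keeps the three tables consistent, so a later change to the cleaning rule only has to be made in one place.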
