How do I replace values across multiple columns in a data frame with values from a second column, based on a match with a third column, using R?
I am working with a single data frame in R containing the following character columns and values.
C1<-c("1","2","3","4","5")
C2<-c("x", "t", "u", "r", "j")
C3<-c("2","5","3","1","4")
C4<-c("3","1","NA", "2","5")
df<-data.frame(C1,C2,C3,C4)
I am trying to write code that will replace each value in C3 and C4 with the C2 value from the row whose C1 matches it.
The initial data frame looks like this:
  C1 C2 C3 C4
1  1  x  2  3
2  2  t  5  1
3  3  u  3 NA
4  4  r  1  2
5  5  j  4  5
The final data frame should look like this:
  C1 C2 C3   C4
1  1  x  t    u
2  2  t  j    x
3  3  u  u <NA>
4  4  r  x    t
5  5  j  r    j
I've yet to come up with code (base R or dplyr) that will accomplish this task. If anyone can lend assistance, I would really appreciate it.
Thanks!
This is a new df that I've tried to manipulate with the code provided by respondents (e.g., df[c("C3", "C4")] <- lapply(df[c("C3", "C4")], function(x) df$C2[match(x, df$C1)])).
I am getting all NAs for C3 and C4 and cannot understand why. There are matches between C3 and C1.
We can use match:
df[c("C3", "C4")] <- lapply(df[c("C3", "C4")], function(x) df$C2[match(x, df$C1)])
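To see why this works, it can help to look at what match returns on its own. The following is a minimal sketch using the sample vectors from the question; the intermediate name idx is introduced here only for illustration.

```r
C1 <- c("1", "2", "3", "4", "5")
C2 <- c("x", "t", "u", "r", "j")
C3 <- c("2", "5", "3", "1", "4")
C4 <- c("3", "1", "NA", "2", "5")

# match() gives, for each element of C3, its position in C1
idx <- match(C3, C1)
idx
# [1] 2 5 3 1 4

# Indexing C2 by those positions pulls out the replacement values
C2[idx]
# [1] "t" "j" "u" "x" "r"

# A value with no counterpart in C1 (here the string "NA") yields NA
C2[match(C4, C1)]
# [1] "u" "x" NA  "t" "j"
```

The lapply call simply applies this same position lookup to each of the selected columns in turn.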
I also used match, but split it up into two separate statements to make it clearer what is going on:
# Create sample data
C1<-c("1","2","3","4","5")
C2<-c("x", "t", "u", "r", "j")
C3<-c("2","5","3","1","4")
C4<-c("3","1","NA", "2","5")
df<-data.frame(C1,C2,C3,C4)
# Make replacements
df$C3_mod <- ifelse(is.na(df$C3), df$C3, df$C2[match(df$C3, df$C1)])
df$C4_mod <- ifelse(is.na(df$C4), df$C4, df$C2[match(df$C4, df$C1)])
# View results
df
#   C1 C2 C3 C4 C3_mod C4_mod
# 1  1  x  2  3      t      u
# 2  2  t  5  1      j      x
# 3  3  u  3 NA      u   <NA>
# 4  4  r  1  2      x      t
# 5  5  j  4  5      r      j
Using match with a matrix:
cols <- c('C3', 'C4')
df[cols] <- df$C2[match(as.matrix(df[cols]), df$C1)]
df
#  C1 C2 C3   C4
#1  1  x  t    u
#2  2  t  j    x
#3  3  u  u <NA>
#4  4  r  x    t
#5  5  j  r    j
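As a possible variation on the match-based answers, the same lookup can be written with a named vector. This is only a sketch: the name lookup is made up for illustration, and it assumes the columns are character vectors rather than factors (hence stringsAsFactors = FALSE, which matters on R versions before 4.0).

```r
df <- data.frame(
  C1 = c("1", "2", "3", "4", "5"),
  C2 = c("x", "t", "u", "r", "j"),
  C3 = c("2", "5", "3", "1", "4"),
  C4 = c("3", "1", "NA", "2", "5"),
  stringsAsFactors = FALSE
)

# Named vector: names are the keys (C1), values are the replacements (C2)
lookup <- setNames(df$C2, df$C1)

cols <- c("C3", "C4")
# Index the lookup by name; keys that are not in C1 come back as NA
df[cols] <- lapply(df[cols], function(x) unname(lookup[x]))
df
```

This should give the same C3 and C4 values as the match versions. It may read more naturally when the key-to-value mapping is reused in several places.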
I solved the issue of my NA values. It turns out that I had whitespace in the column values that I hadn't accounted for.
Again, thanks to everyone for their responses. I learned a lot in the process.
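For anyone hitting the same symptom, here is a small sketch of how stray whitespace makes match fail and how trimws can work around it. The values below are hypothetical, not the actual data from the new df.

```r
C1 <- c("1", "2", "3")
C2 <- c("x", "t", "u")
# Hypothetical values with stray leading/trailing spaces
C3 <- c(" 2", "3 ", "1")

# match() compares strings exactly: " 2" is not "2", so those lookups fail
C2[match(C3, C1)]
# [1] NA  NA  "x"

# trimws() strips leading and trailing whitespace before matching
C2[match(trimws(C3), C1)]
# [1] "t" "u" "x"
```

If the key column might also contain padding, applying trimws to it as well (for example match(trimws(C3), trimws(C1))) is probably the safer choice.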