
How do I replace values across multiple columns in a data frame with values from a second column, based on a match with a third column, using R?

I am working with a single data frame in R containing the following character columns and values.

C1<-c("1","2","3","4","5")
C2<-c("x", "t", "u", "r", "j")
C3<-c("2","5","3","1","4")
C4<-c("3","1","NA", "2","5")
df<-data.frame(C1,C2,C3,C4)

I am trying to write code that will replace values in C3 and C4 as follows:

  1. For each value in C3, find the same value in C1.
  2. Replace the value in C3 with the value in C2 from the row where C1 matches. For example, in C3, "2" (the first value) would be replaced with "t", "5" with "j", "3" with "u", and so forth.
  3. Repeat the same procedure for values in C4.
  4. Skip any cells with an NA in C3 or C4.

The initial data frame looks like this:

[Image: initial data frame]

The final data frame should look like this:

[Image: updated data frame]

I've yet to come up with code (base R or dplyr) that will accomplish this task. If anyone can lend assistance, I would really appreciate it.

Thanks!

This is a new df that I've tried to manipulate with the code provided by respondents (e.g., df[c("C3", "C4")] <- lapply(df[c("C3", "C4")], function(x) df$C2[match(x, df$C1)])).

I am getting all NAs for C3 and C4 and cannot understand why. There are matches between C3 and C1.

[Image: the new data frame]

We can use match:

df[c("C3", "C4")] <- lapply(df[c("C3", "C4")], function(x) df$C2[match(x, df$C1)])
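To see why this works, it may help to look at match on its own. The sketch below uses stand-in vectors (keys, values, and x are illustrative names, not part of the original data frame) and shows that match returns the position of each element in the lookup vector, which can then index the replacement vector.

```r
# Stand-in vectors for illustration only
keys   <- c("1", "2", "3", "4", "5")   # plays the role of df$C1
values <- c("x", "t", "u", "r", "j")   # plays the role of df$C2
x      <- c("2", "5", "3", NA)         # values to translate, as in df$C3

# Position of each element of x within keys; NA where there is no match
pos <- match(x, keys)
pos
# [1]  2  5  3 NA

# Indexing with those positions yields the replacements; NA stays NA
values[pos]
# [1] "t" "j" "u" NA
```

Because a missing input produces a missing position, NA cells are carried through without any special handling.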

I also used match, but split it up into two separate statements to make it clearer what is going on:

# Create sample data
C1<-c("1","2","3","4","5")
C2<-c("x", "t", "u", "r", "j")
C3<-c("2","5","3","1","4")
C4<-c("3","1","NA", "2","5")
df<-data.frame(C1,C2,C3,C4)

# Make replacements
df$C3_mod <- ifelse(is.na(df$C3), df$C3, df$C2[match(df$C3, df$C1)])
df$C4_mod <- ifelse(is.na(df$C4), df$C4, df$C2[match(df$C4, df$C1)])

# View results
df
#   C1 C2 C3 C4 C3_mod C4_mod
# 1  1  x  2  3      t      u
# 2  2  t  5  1      j      x
# 3  3  u  3 NA      u   <NA>
# 4  4  r  1  2      x      t
# 5  5  j  4  5      r      j
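One detail worth noting about the sample data: the third element of C4 is the string "NA", not a real missing value, so is.na() does not flag it. The result is still NA in C4_mod only because the string "NA" has no match in C1. A small sketch of the difference, using a stand-in vector x (an illustrative name):

```r
is.na("NA")   # FALSE: this is an ordinary two-character string
is.na(NA)     # TRUE: this is a real missing value

# Stand-in copy of the C4 values from the sample data
x <- c("3", "1", "NA", "2", "5")

# Convert the literal string "NA" into a real missing value
x[x == "NA"] <- NA
is.na(x)
# [1] FALSE FALSE  TRUE FALSE FALSE
```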

Using match with a matrix:

cols <- c('C3', 'C4')
df[cols] <- df$C2[match(as.matrix(df[cols]), df$C1)]
df

#  C1 C2 C3   C4
#1  1  x  t    u
#2  2  t  j    x
#3  3  u  u <NA>
#4  4  r  x    t
#5  5  j  r    j
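A rough sketch of what happens inside that one-liner, assuming the missing value is stored as a real NA rather than the string "NA": as.matrix lays the selected columns out column by column, so the flattened vector is all of C3 followed by all of C4, and the replaced values can be reshaped back with the same number of rows. The names flat and replaced are illustrative.

```r
df <- data.frame(
  C1 = c("1", "2", "3", "4", "5"),
  C2 = c("x", "t", "u", "r", "j"),
  C3 = c("2", "5", "3", "1", "4"),
  C4 = c("3", "1", NA, "2", "5"),   # assumption: a real NA, not "NA"
  stringsAsFactors = FALSE
)
cols <- c("C3", "C4")

# Flatten the two columns: C3 values first, then C4 values
flat <- as.vector(as.matrix(df[cols]))
in_column_order <- identical(flat, c(df$C3, df$C4))   # TRUE

# Look up every value at once, then reshape back into two columns
replaced <- df$C2[match(flat, df$C1)]
df[cols] <- matrix(replaced, nrow = nrow(df))
df
```

The explicit matrix() call only makes the reshaping visible; the shorter form above relies on the data frame assignment filling the columns in the same order.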

I solved the issue with my NA values. It turns out that I had whitespace in the column values that I hadn't accounted for. Again, thanks to everyone for their responses. I learned a lot in the process.
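For anyone hitting the same symptom, here is a minimal sketch of how stray whitespace defeats match and how base R's trimws() can work around it. The vectors are made up for illustration and are not the asker's actual data.

```r
# Hypothetical data: some lookup values carry leading or trailing spaces
keys   <- c("1", "2", "3")
values <- c("x", "t", "u")
x      <- c("2 ", " 3", "1")

# "2 " and " 3" are not equal to "2" and "3", so they fail to match
before <- values[match(x, keys)]
before
# [1] NA  NA  "x"

# Stripping the whitespace first restores the matches
after <- values[match(trimws(x), keys)]
after
# [1] "t" "u" "x"
```

If the key column may also contain stray spaces, the same trimws() call can be applied to it before matching.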
