In R, how to replace values in a column with values of another column of another data set based on a condition?

I have two data sets, samples of which are given below. I need to replace the project names in target_df$project_name, whenever they are present in registry_df$to_change, with the corresponding values in registry_df$replacement. However, the code I tried did not deliver any result. How should it be corrected, or what other way is there to achieve the desired goal?

Data sets:

target_df <- tibble::tribble(
  ~project_name,     ~sum,   
  "Mark",            "4307",     
  "Boat",            "9567",       
  "Delorean",        "5344",      
  "Parix",           "1043",
)

registry_df <- tibble::tribble(
  ~to_change,     ~replacement,   
  "Mark",            "Duck",     
  "Boat",            "Tank",       
  "Toloune",         "Bordeaux",      
  "Hunge",           "Juron",
)

Desired output of target_df:

project_name        sum   
  "Duck"            "4307"     
  "Tank"            "9567"       
  "Delorean"        "5344"      
  "Parix"           "1043"

Code:

library(data.table)

target_df <- transform(target_df, 
                       project_name = ifelse(target_df$project_name %in% registry_df$to_change),
                       registry_df$replacement,
                       project_name
)

A dplyr solution. There's probably a more elegant way with fewer steps.

library(dplyr)

target_df <- target_df %>% 
  left_join(registry_df,  
            by = c("project_name" = "to_change")) %>% 
  mutate(replacement = ifelse(is.na(replacement), project_name, replacement)) %>% 
  select(project_name = replacement, sum)

Result:

# A tibble: 4 × 2
  project_name sum  
  <chr>        <chr>
1 Duck         4307 
2 Tank         9567 
3 Delorean     5344 
4 Parix        1043
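
One shorter variant, offered only as a sketch, skips the join and looks the replacement up inside mutate(), with dplyr::coalesce() falling back to the original name where no replacement exists. It assumes registry_df$to_change has no duplicated entries; the name result is just illustrative.

```r
library(dplyr)

# Sample data from the question
target_df <- tibble::tribble(
  ~project_name, ~sum,
  "Mark",        "4307",
  "Boat",        "9567",
  "Delorean",    "5344",
  "Parix",       "1043"
)

registry_df <- tibble::tribble(
  ~to_change, ~replacement,
  "Mark",     "Duck",
  "Boat",     "Tank",
  "Toloune",  "Bordeaux",
  "Hunge",    "Juron"
)

result <- target_df %>%
  mutate(
    project_name = coalesce(
      # NA wherever project_name has no entry in the registry
      registry_df$replacement[match(project_name, registry_df$to_change)],
      # fallback: keep the original name
      project_name
    )
  )
```

This keeps the row order and the sum column untouched, and no temporary replacement column has to be dropped afterwards.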

A base R solution: you can match the columns using the match function. Since not all values of target_df$project_name are in registry_df$to_change, the matching variable will contain NAs. Therefore, I included the ifelse function, which keeps the original values where the match is NA.

matching <- registry_df$replacement[match(target_df$project_name, registry_df$to_change)]
target_df$project_name <- ifelse(is.na(matching),
                                 target_df$project_name,
                                 matching)

target_df gives the expected output:

  project_name sum  
  <chr>        <chr>
1 Duck         4307 
2 Tank         9567 
3 Delorean     5344 
4 Parix        1043 
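
To see why the ifelse step is needed, it can help to look at the intermediate values. The following is a small sketch of the same approach on plain data frames built from the question's sample data; the names idx and looked_up are illustrative and not part of the original answer.

```r
# Sample data from the question, as plain data frames
target_df <- data.frame(
  project_name = c("Mark", "Boat", "Delorean", "Parix"),
  sum = c("4307", "9567", "5344", "1043"),
  stringsAsFactors = FALSE
)

registry_df <- data.frame(
  to_change = c("Mark", "Boat", "Toloune", "Hunge"),
  replacement = c("Duck", "Tank", "Bordeaux", "Juron"),
  stringsAsFactors = FALSE
)

# Row position of each project name in the registry, NA when it is absent:
# here idx is c(1, 2, NA, NA)
idx <- match(target_df$project_name, registry_df$to_change)

# Indexing with NA yields NA, so unmatched names have no replacement yet:
# here looked_up is c("Duck", "Tank", NA, NA)
looked_up <- registry_df$replacement[idx]

# Keep the original name wherever no replacement was found
target_df$project_name <- ifelse(is.na(idx), target_df$project_name, looked_up)
```

Because match() returns only the first matching position, duplicated entries in registry_df$to_change would be ignored silently after the first one.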
