
Finding and replacing multiple items in R

As a newbie to R, I am having to write all my find-and-replace statements one line at a time (see code below). Is it possible to do this in a more succinct way (i.e. in one line only)?

YP$gender <- replace(as.character(YP$gender), YP$gender == "Female", "F")   
YP$gender <- replace(as.character(YP$gender), YP$gender == "Male", "M")

If there are just two replacements, use 'ifelse' (note that every value other than "Female" becomes "M"):

YP$gender <- ifelse(as.character(YP$gender) == "Female", "F", "M") 
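A small sketch of that caveat, using a made-up vector (the "Other" value and the variable names are assumptions for illustration, not from the question's data). Nesting a second ifelse is one way to leave unknown values untouched:

```r
# Hypothetical data with a third value and a missing value
gender <- c("Female", "Male", "Other", NA)

# Single ifelse: anything that is not "Female" is turned into "M"
single <- ifelse(gender == "Female", "F", "M")
# "F" "M" "M" NA

# Nested ifelse: only the two known values are changed
nested <- ifelse(gender == "Female", "F",
                 ifelse(gender == "Male", "M", gender))
# "F" "M" "Other" NA
```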

Otherwise I would use left_join:

# Data
df <- data.frame(value = sample(1:3,10, replace = TRUE),
                 gender = sample(c("male", "female", "x"), 10, prob = c(0.4,0.4,0.2), replace = TRUE))

# Creating replacements
replace <- data.frame(gender = c("male", "female"), gender_short = c("m", "f"))

# Making replacements
library(dplyr)
df <- left_join(df, replace, by = "gender")  # name the key explicitly
df

   value gender gender_short
1      1 female            f
2      2 female            f
3      3      x         <NA>
4      2   male            m
5      3 female            f
6      3      x         <NA>
7      3 female            f
8      1      x         <NA>
9      3   male            m
10     3   male            m
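If the NA for unmatched values is not wanted, one possible follow-up is to fall back to the original value with dplyr's coalesce. This is only a sketch on a tiny made-up data frame; the names lookup and out are assumptions for illustration:

```r
library(dplyr)

# Hypothetical data and lookup table
df <- data.frame(gender = c("male", "female", "x"),
                 stringsAsFactors = FALSE)
lookup <- data.frame(gender = c("male", "female"),
                     gender_short = c("m", "f"),
                     stringsAsFactors = FALSE)

out <- df %>%
  left_join(lookup, by = "gender") %>%
  # keep the original value where the lookup had no match
  mutate(gender_short = coalesce(gender_short, gender))

out$gender_short
# "m" "f" "x"
```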

You can use '%in%' for multiple comparisons instead of '=='.

replace(as.character(YP$gender), YP$gender %in% c("Male","Female"), c("M", "F"))   

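EDIT: Sorry, this code won't work as I thought. replace() fills the matched positions with the replacement values in order (recycling them), rather than pairing each key with its own replacement.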
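A short demonstration of the problem on a made-up vector (g and bad are illustrative names only):

```r
# Hypothetical data
g <- c("Male", "Female", "Female", "Male")

# All four positions match, so c("M", "F") is recycled over them in order
bad <- replace(g, g %in% c("Male", "Female"), c("M", "F"))
bad
# "M" "F" "M" "F"   (positions 3 and 4 are wrong)
```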

But you can use loops to solve it.

YP = c("a","b","b","a","c")

keys = c("a", "b", "c")
rep_value = c("A", "B", "C")

for (index in seq_along(keys)) {  # seq_along is safe even if keys is empty
  sub_key = keys[index]
  sub_rep_value = rep_value[index]

  value_index = which(YP %in% sub_key)
  YP[value_index] = sub_rep_value
}
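As a possible alternative to the loop, the same mapping can be sketched in one line with a named vector used as a lookup. This is a different technique from the loop above, and the names lookup, YP_new, x and mapped are assumptions for illustration:

```r
YP <- c("a", "b", "b", "a", "c")

# Names are the keys, elements are the replacement values
lookup <- c(a = "A", b = "B", c = "C")

# Index the lookup by the original values; unname() drops the key names
YP_new <- unname(lookup[YP])
# "A" "B" "B" "A" "C"

# Keys missing from the lookup come back as NA, so fall back to the original
x <- c("a", "z")
mapped <- unname(lookup[x])
kept <- ifelse(is.na(mapped), x, mapped)
# "A" "z"
```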

It depends on how many targets and replacements you have. If you have a lot, the easiest way is probably to create a lookup table with two columns, one holding the target and one holding the replacement, with one row per unique value. If that lookup table is called df and its target column is named gender, the code would look something like:

library(dplyr)
YP <- YP %>%
  merge(df, by = "gender", all.x = TRUE)  # all.x = TRUE keeps unmatched rows of YP
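Note that merge() sorts the result by the key column by default, so the rows of YP may come back in a different order. If the original order matters, a base R sketch with match() against the same kind of lookup table could look like this (the data and the names lookup and idx are made up for illustration):

```r
# Hypothetical data and lookup table
YP <- data.frame(id = 1:4,
                 gender = c("Male", "Female", "Other", "Female"),
                 stringsAsFactors = FALSE)
lookup <- data.frame(gender = c("Male", "Female"),
                     gender_short = c("M", "F"),
                     stringsAsFactors = FALSE)

# Position of each YP$gender value in the lookup (NA when absent)
idx <- match(YP$gender, lookup$gender)

# Pull the replacement by position; row order of YP is unchanged
YP$gender_short <- lookup$gender_short[idx]
YP$gender_short
# "M" "F" NA "F"
```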

If there aren't too many unique values to replace, then instead of nesting ifelse statements you could use case_when from dplyr. You can chain the steps together using the pipe %>%:

library(dplyr)
YP %>%
  mutate(gender = case_when(
    gender == "Female" ~ "F",
    gender == "Male"   ~ "M",
    TRUE               ~ gender
  ))

It looks like you have a factor column, so we just need to change the labels (given in the order of the existing levels), something like this:

YP$gender <- factor(YP$gender, labels = c("F", "M"))

Reproducible example:

x <- factor(c("Female", "Male", "Female"))
x
# [1] Female Male   Female
# Levels: Female Male

#Check the levels
levels(x)
# [1] "Female" "Male"  

# relabel (labels follow the order of levels(x): "Female" -> "F", "Male" -> "M")
x <- factor(x, labels = c("F", "M"))
x
# [1] F M F
# Levels: F M
levels(x)
# [1] "F" "M"
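Because labels are matched purely by position, a mistake in the order silently swaps the values. A sketch of a more explicit variant assigns a named list to levels(), where each name is the new level and each element is the old one (the example vector here is made up):

```r
x <- factor(c("Male", "Female", "Male"))

# New level on the left, old level on the right
levels(x) <- list(F = "Female", M = "Male")

as.character(x)
# "M" "F" "M"
levels(x)
# "F" "M"
```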

