
Replace values in entire data frame with column values in R

I searched but could not find an obvious answer to this question, so hopefully it is not a duplicate. I have a data frame that looks like this:

X1 X2 V1 V2 V3 ... Vn
A  B  0  1  2      1
B  C  1  0  1      0
A  C  2  1  0      1 

What I want to achieve is to replace the values in V1 to Vn with genotype labels built from X1 and X2, where each value is the "dosage" of X2. So for row 1 (each row may have different values of X1 and X2):

  • if the value is 0, I want to replace it with AA;
  • if the value is 1, I want to replace it with AB;
  • if the value is 2, I want to replace it with BB.

The expected outcome is:

X1 X2 V1 V2 V3 ... Vn
A  B  AA AB BB     AB
B  C  BC BB BC     BB
A  C  CC AC AA     AC

Here is the sample data frame:


Thanks for the help!

This is inspired by @Matt's answer. We can use mutate_at with case_when and paste0 to achieve this task.

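## Load packages
library(dplyr)

## Recode every column except X1 and X2, row by row
dat2 <- dat %>%
  mutate_at(vars(-X1, -X2), .funs = list(~case_when(
      . == 0            ~paste0(X1, X1),
      . == 1            ~paste0(X1, X2),
      . == 2            ~paste0(X2, X2),
      TRUE              ~NA_character_
  )))
dat2
#   X1 X2 V1 V2 V3 Vn
# 1  A  B AA AB BB AB
# 2  B  C BC BB BC BB
# 3  A  C CC AC AA AC

## Data (create this before running the code above)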
dat <- read.table(text = "X1 X2 V1 V2 V3 Vn
A  B  0  1  2  1
B  C  1  0  1  0
A  C  2  1  0  1 ",
                  stringsAsFactors = FALSE, header = TRUE)
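
In dplyr 1.0.0 and later, mutate_at is superseded by across(). A minimal sketch of the same idea with across(), assuming the sample data above and a recent dplyr version, might look like this:

```r
library(dplyr)

# Sample data, repeated here so the block runs on its own
dat <- read.table(text = "X1 X2 V1 V2 V3 Vn
A  B  0  1  2  1
B  C  1  0  1  0
A  C  2  1  0  1",
                  stringsAsFactors = FALSE, header = TRUE)

# Recode every column except X1 and X2; X1 and X2 are looked up row by row
dat2 <- dat %>%
  mutate(across(-c(X1, X2), ~ case_when(
    .x == 0 ~ paste0(X1, X1),
    .x == 1 ~ paste0(X1, X2),
    .x == 2 ~ paste0(X2, X2),
    TRUE    ~ NA_character_
  )))
```

The logic is unchanged; only the column-selection wrapper differs.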

Using your reprex, you can do:

library(dplyr)

df %>% 
  mutate_at(vars(V1:V3), ~ case_when(
      . == 0 ~ "AA",
      . == 1 ~ "AB",
      . == 2 ~ "BB"
  ))

With your actual df, you can replace V1:V3 with V1:Vn.

It's not an elegant solution, but for the sake of completeness: just nest two for loops.

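# Assumes df is a base data.frame: X1 and X2 in columns 1-2, dosages from column 3 on
for (i in seq_len(nrow(df))) {
  for (j in 3:ncol(df)) {
    if (df[i,j] == 0) {
      df[i,j] <- paste0(df[i,1], df[i,1])   # dosage 0: X1 twice
    } else if (df[i,j] == 1) {
      df[i,j] <- paste0(df[i,1], df[i,2])   # dosage 1: X1 then X2
    } else if (df[i,j] == 2) {
      df[i,j] <- paste0(df[i,2], df[i,2])   # dosage 2: X2 twice
    }
  }
}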
Sorry for that.
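
For comparison, the same replacement can be sketched without explicit loops by recoding one column at a time with nested ifelse() calls. This assumes df is a base data.frame laid out like the sample (X1, X2, then the dosage columns); v_cols is a hypothetical helper name introduced only for this illustration.

```r
# Sample data, repeated here so the block runs on its own
df <- read.table(text = "X1 X2 V1 V2 V3 Vn
A  B  0  1  2  1
B  C  1  0  1  0
A  C  2  1  0  1",
                 stringsAsFactors = FALSE, header = TRUE)

# Names of the dosage columns (everything except X1 and X2)
v_cols <- setdiff(names(df), c("X1", "X2"))

# Recode each dosage column as a whole vector; X1 and X2 line up row by row
df[v_cols] <- lapply(df[v_cols], function(dose) {
  ifelse(dose == 0, paste0(df$X1, df$X1),
  ifelse(dose == 1, paste0(df$X1, df$X2),
  ifelse(dose == 2, paste0(df$X2, df$X2), NA_character_)))
})
```

Values other than 0, 1 or 2 become NA here, which the loop version would instead leave untouched.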

Use spread and gather

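library(tidyverse)

# V1 to V3 values reconstructed from the output shown further down
df <- tibble(X1=c("A","B","A"),
             X2=c("B","C","C"),
             V1=c(0,1,2),
             V2=c(1,0,1),
             V3=c(2,1,0))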

Capture your from:to translation

transl <- tibble(DOSE = c(0,1,2),
                 OUTCOME = c("AA", "AB", "AC"))

Then gather your values into long form

longTbl <- df %>% 
  gather(key = "TheV", value = "DOSE", na.rm = TRUE,starts_with("V")) %>% 
  left_join(transl, by = "DOSE") %>% 
  select(- DOSE)

# A tibble: 9 x 4
  X1    X2    TheV  OUTCOME
  <chr> <chr> <chr> <chr>  
1 A     B     V1    AA     
2 B     C     V1    AB     
3 A     C     V1    AC     
4 A     B     V2    AB     
5 B     C     V2    AA     
6 A     C     V2    AB     
7 A     B     V3    AC     
8 B     C     V3    AB     
9 A     C     V3    AA 

You might be better off leaving it there. But we can pivot it back with spread.

widTbl <- longTbl %>% 
  spread(TheV, OUTCOME )

# A tibble: 3 x 5
  X1    X2    V1    V2    V3   
  <chr> <chr> <chr> <chr> <chr>
1 A     B     AA    AB    AC   
2 A     C     AC    AB    AA   
3 B     C     AB    AA    AB   

And Bob's your uncle.
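
In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). A sketch of the same pipeline with the newer functions, assuming the same df and transl as above (both are rebuilt here from the sample values so the block runs on its own):

```r
library(dplyr)
library(tidyr)

# Same sample values and lookup table as in the answer above
df <- tibble(X1 = c("A", "B", "A"),
             X2 = c("B", "C", "C"),
             V1 = c(0, 1, 2),
             V2 = c(1, 0, 1),
             V3 = c(2, 1, 0))

transl <- tibble(DOSE = c(0, 1, 2),
                 OUTCOME = c("AA", "AB", "AC"))

# Long form, join the translation, then back to wide form
widTbl <- df %>%
  pivot_longer(starts_with("V"), names_to = "TheV", values_to = "DOSE") %>%
  left_join(transl, by = "DOSE") %>%
  select(-DOSE) %>%
  pivot_wider(names_from = TheV, values_from = OUTCOME)
```

As in the original, the lookup table is fixed, so the labels do not depend on each row's X1 and X2.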

