
How to recode data within a data frame in R

A sample of the data, which can be generalised to many columns, is as follows:

id  colour  zm          cinema  pen     wm          monitor mn
1   blue    good        a       wood    bad         24      very good
2   Yellow  bad         b       metal   good enough 23      good
3   Red     good enough d       plastic bad         27      good enough

I want to get the following table:

id  colour  zm  cinema  pen     wm  monitor mn
1   blue    B   a       wood    D   24      A
2   Yellow  D   b       metal   C   23      B
3   Red     C   d       plastic D   27      C

The mapping is: very good = A, good = B, good enough = C, bad = D. I understand this can be done with mutate(), but I am struggling to apply it within a data frame.

We can use a named vector to change the values:

nm1 <- setNames(LETTERS[1:4], c("very good", "good", "good enough", "bad"))
library(dplyr)
df2 <- df1 %>%
     mutate(across(c(zm, wm, mn), ~ nm1[.]))

-output

df2
#  id colour zm cinema     pen wm monitor mn
#1  1   blue  B      a    wood  D      24  A
#2  2 Yellow  D      b   metal  C      23  B
#3  3    Red  C      d plastic  D      27  C
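
The lookup works because indexing a named vector with a character vector returns, for each value, the element whose name matches it. A minimal base R sketch on a plain vector, where the names `lookup` and `ratings` are illustrative and not part of the answer above:

```r
# Named vector: names are the old labels, values are the new codes
lookup <- c("very good" = "A", "good" = "B", "good enough" = "C", "bad" = "D")

ratings <- c("good", "bad", "good enough", "very good")

# Indexing by name returns one code per input value;
# unname() drops the names carried over from the lookup
codes <- unname(lookup[ratings])
print(codes)

# A value that is absent from the lookup yields NA rather than an error
print(unname(lookup["excellent"]))
```

If a column may contain labels outside the four listed, checking for NA after the lookup is one way to catch them.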

data

df1 <- structure(list(id = 1:3, colour = c("blue", "Yellow", "Red"), 
    zm = c("good", "bad", "good enough"), cinema = c("a", "b", 
    "d"), pen = c("wood", "metal", "plastic"), wm = c("bad", 
    "good enough", "bad"), monitor = c(24L, 23L, 27L), mn = c("very good", 
    "good", "good enough")), class = "data.frame", row.names = c(NA, 
-3L))

A base R option using match():

v <- c("very good", "good", "good enough", "bad")
cols <- c("zm", "wm", "mn")
df[cols] <- LETTERS[seq_along(v)][match(unlist(df[cols]), v)]

gives

> df
  id colour zm cinema     pen wm monitor mn
1  1   blue  B      a    wood  D      24  A
2  2 Yellow  D      b   metal  C      23  B
3  3    Red  C      d plastic  D      27  C
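
Here match() returns the position of each value in v, and those positions then index the replacement codes. The same idea can be sketched one column at a time with lapply(), which some may find easier to follow than unlist(). The helper name `recode_col` is hypothetical, and the data frame is a reduced copy of the sample data:

```r
df <- data.frame(
  id = 1:3,
  zm = c("good", "bad", "good enough"),
  wm = c("bad", "good enough", "bad"),
  mn = c("very good", "good", "good enough"),
  stringsAsFactors = FALSE
)

v <- c("very good", "good", "good enough", "bad")
codes <- LETTERS[seq_along(v)]   # "A" "B" "C" "D"
cols <- c("zm", "wm", "mn")

# Hypothetical helper: the position of each value in v picks the code
recode_col <- function(x) codes[match(x, v)]

# Apply the helper to each target column and assign the results back
df[cols] <- lapply(df[cols], recode_col)
print(df)
```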

Data

> dput(df)
structure(list(id = 1:3, colour = c("blue", "Yellow", "Red"),
    zm = c("good", "bad", "good enough"), cinema = c("a", "b",
    "d"), pen = c("wood", "metal", "plastic"), wm = c("bad",
    "good enough", "bad"), monitor = c(24L, 23L, 27L), mn = c("very good",
    "good", "good enough")), class = "data.frame", row.names = c(NA,
-3L))

Another approach would be to use forcats::fct_recode():

vars <- c(A = "very good", B = "good", C = "good enough", D = "bad")

library(dplyr)
library(forcats)
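# Assumes the data frame is named df1, as in the answers above.
# where(is.character) also selects colour, cinema and pen: every selected
# column becomes a factor, and fct_recode() may warn about levels it does not find.
df1 %>% 
  mutate(across(where(is.character), ~ fct_recode(., !!!vars)))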
  id colour zm cinema     pen wm monitor mn
1  1   blue  B      a    wood  D      24  A
2  2 Yellow  D      b   metal  C      23  B
3  3    Red  C      d plastic  D      27  C

Alternatively, you could use recode which makes for good legibility:

library(dplyr)

df1 <- df1 %>%
  mutate(
    zm = recode(zm, "very good" = "A", "good" = "B", "good enough" = "C", "bad" = "D"))

df1
#   id colour zm cinema     pen          wm monitor          mn
# 1  1   blue  B      a    wood         bad      24   very good
# 2  2 Yellow  D      b   metal good enough      23        good
# 3  3    Red  C      d plastic         bad      27 good enough

Edit: you will, of course, need to load dplyr, since mutate() and recode() are both part of it.

Explanation: mutate() adds or changes columns in your data frame. recode() takes zm and substitutes the values as specified, and mutate() assigns the result back to zm.
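
Since the question asks for something that generalises to many columns, the recode() call can be combined with across() so that the mapping is written only once. This is a sketch, assuming dplyr 1.0.0 or later is installed; the name `lookup` is illustrative, and the data frame is a reduced copy of the sample data:

```r
library(dplyr)

df1 <- data.frame(
  id = 1:3,
  zm = c("good", "bad", "good enough"),
  wm = c("bad", "good enough", "bad"),
  mn = c("very good", "good", "good enough"),
  stringsAsFactors = FALSE
)

# Old label = new code; the vector is spliced into recode() with !!!
lookup <- c("very good" = "A", "good" = "B", "good enough" = "C", "bad" = "D")

# Apply the same recode() call to each of the three rating columns
df2 <- df1 %>%
  mutate(across(c(zm, wm, mn), ~ recode(.x, !!!lookup)))

print(df2)
```

Listing the columns explicitly leaves other character columns such as colour or pen untouched.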
