R dplyr/tidyr recode column values

I had multiple data sets that I merged into one data frame with rbind, after selecting and renaming the columns with dplyr.

GapAnalysis16 <- select(memSat16,
                        importance_communication_website_content,
                        satisfaction_communication_website_content,
                        status,
                        Year2016) %>%
  rename(ComImpt = importance_communication_website_content,
         ComSat = satisfaction_communication_website_content,
         status = status,
         year = Year2016)

GapAnalysis17July <- select(memSatJuly17,
                            importance_communication_website_content_JULY17,
                            satisfaction_communication_website_content_JULY17,
                            role_primary_new_JULY17,
                            Year2017_July) %>%
  rename(ComImpt = importance_communication_website_content_JULY17,
         ComSat = satisfaction_communication_website_content_JULY17,
         status = role_primary_new_JULY17,
         year = Year2017_July)

GapAnalysis <- rbind(GapAnalysis17July, GapAnalysis16)

And got my new combined data set:

   ComImpt ComSat status year
1       4      2      1    1
2      NA     NA      1    1
3       4      5      5    1
4       3      3      5    1
5       6      6      5    1
6       5      5      1    1

I needed it in long form, so I converted it:

GapAnalysis_LongForm <- GapAnalysis %>%
  gather(key = Product, value = Score, ComSat, ComImpt)

And now have this:

    status  year Product Score
     <dbl> <dbl> <chr>   <dbl>
 1     1.    1. ComSat      2.
 2     5.    1. ComSat      5.
 3     5.    2. ComSat      3.
 4     1.    1. ComSat      5.
 5     1.    1. ComImpt     4.
 6     5.    1. ComSat      4.

I now need to recode ComSat and ComImpt to the values 1 and 2, but I'm stumped: recode and recode_factor are giving me errors. I'm trying to get output something like this:

    status  year Product Score
     <dbl> <dbl> <chr>   <dbl>
 1     1.    1. 1           2.
 2     5.    1. 1           5.
 3     5.    2. 1           3.
 4     1.    1. 1           5.
 5     1.    1. 2           4.
 6     5.    1. 1           4.

Any general points in the right direction?

I appreciate it!!!

I guess you are having problems because you are using recode_factor outside of mutate. In the tidyverse, when you modify columns of a data frame, make sure you do it inside mutate.

Any of the following should work and give the same result.


With the base factor function

df %>%
  mutate(Product = factor(Product, levels = c("ComSat", "ComImpt"), labels = c(1L, 2L)))

With recode_factor function

df %>%
  mutate(Product = recode_factor(Product, "ComSat" = 1L, "ComImpt" = 2L))

or

df3 <- df %>%
  mutate_at(vars(Product), ~recode_factor(.,"ComSat" = 1L, "ComImpt" = 2L))
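
For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The tibble is made up to mimic the long-form data in the question (an assumption, not the asker's real data). Note that recode_factor() gives back a factor whose levels are "1" and "2"; if plain integers are wanted instead, recode() inside mutate() is one option.

```r
library(dplyr)

# Toy long-form data standing in for GapAnalysis_LongForm (values are made up)
df <- tibble(
  status  = c(1, 5, 5, 1),
  year    = c(1, 1, 2, 1),
  Product = c("ComSat", "ComSat", "ComImpt", "ComSat"),
  Score   = c(2, 5, 3, 4)
)

# recode_factor() operates on a vector, so it is called inside mutate()
recoded <- df %>%
  mutate(Product = recode_factor(Product, "ComSat" = 1L, "ComImpt" = 2L))

# Same mapping with recode(), which keeps the result as an integer vector
recoded_int <- df %>%
  mutate(Product = recode(Product, "ComSat" = 1L, "ComImpt" = 2L))

str(recoded$Product)
str(recoded_int$Product)
```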

If there are only two Product codes (ComSat and ComImpt) in your data.frame, a simple ifelse will be easier.

You need one additional step in the dplyr chain: mutate(Product = ifelse(Product == "ComSat", 1L, 2L))

GapAnalysis_LongForm  <- GapAnalysis %>%
  gather(key = Product,value = Score, ComSat, ComImpt) %>%
  mutate(Product = ifelse(Product=="ComSat", 1L, 2L))

#    status year Product Score
# 1       1    1       1     2
# 2       1    1       1    NA
# 3       5    1       1     5
# 4       5    1       1     3
# 5       5    1       1     6
# 6       1    1       1     5
# 7       1    1       2     4
# 8       1    1       2    NA
# 9       5    1       2     4
# 10      5    1       2     3
# 11      5    1       2     6
# 12      1    1       2     5
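
One caveat with the ifelse approach: every value that is not "ComSat" becomes 2, so it is only safe when there really are just two codes. The sketch below uses a hypothetical third code, "ComOther" (not present in the question's data), to show the difference against case_when(), which can send unexpected codes to NA instead.

```r
library(dplyr)

# Hypothetical data: "ComOther" is an invented third code, for illustration only
long <- tibble(
  Product = c("ComSat", "ComImpt", "ComOther"),
  Score   = c(2, 4, 3)
)

out <- long %>%
  mutate(
    # ifelse: anything that is not "ComSat" is mapped to 2
    via_ifelse = ifelse(Product == "ComSat", 1L, 2L),
    # case_when: unmatched codes fall through to NA
    via_case_when = case_when(
      Product == "ComSat"  ~ 1L,
      Product == "ComImpt" ~ 2L,
      TRUE                 ~ NA_integer_
    )
  )

print(out)
```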

Modifying the mutate_at approach of @hpesoj626:

According to the tidyverse documentation, the scoped verbs (_if, _at, _all) have been superseded by across() used inside an existing verb such as mutate.

The following code should work:

df3 <- df %>%
  mutate(across(Product, ~recode_factor(., "ComSat" = 1L, "ComImpt" = 2L)))
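
The detail that matters with across() is that the function is passed as its second argument, inside the across() call. A small sketch on made-up data (the tibble is an assumption, not the asker's data) that compares it with the plain mutate() form:

```r
library(dplyr)

# Made-up data with the two product codes from the question
df <- tibble(
  Product = c("ComSat", "ComImpt", "ComSat"),
  Score   = c(2, 4, 5)
)

# across(columns, function): the lambda sits inside across()
with_across <- df %>%
  mutate(across(Product, ~ recode_factor(.x, "ComSat" = 1L, "ComImpt" = 2L)))

# Equivalent plain mutate() for a single column
plain <- df %>%
  mutate(Product = recode_factor(Product, "ComSat" = 1L, "ComImpt" = 2L))

all.equal(with_across$Product, plain$Product)
```

For a single column the plain mutate() form is arguably simpler; across() pays off when the same recoding has to be applied to several columns at once.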
