
Create new levels in factors by adding existing factors: dplyr

I have a dataset of students' self-report measures on several subscales (stored as a factor with one level per subscale). I want to add new factor levels for each participant.

# A tibble: 12 x 3
   first_name subscales          value
   <chr>      <fct>              <int>
 1 P1         Emotion Regulation     5
 2 P1         Empathy                7
 3 P1         Family Support        10
 4 P1         Gratitude             12
 5 P1         Optimism              12
 6 P1         Peer Support           9
 7 P1         Persistence            5
 8 P1         School Support         8
 9 P1         Self-Awareness         7
10 P1         Self-Control           6
11 P1         Self-Efficacy          8
12 P1         Zest                  12

#dput 

structure(list(first_name = c("P1", "P1", "P1", "P1", "P1", "P1", 
"P1", "P1", "P1", "P1", "P1", "P1"), subscales = structure(1:12, .Label = c("Emotion Regulation", 
"Empathy", "Family Support", "Gratitude", "Optimism", "Peer Support", 
"Persistence", "School Support", "Self-Awareness", "Self-Control", 
"Self-Efficacy", "Zest"), class = "factor"), value = c(5L, 7L, 
10L, 12L, 12L, 9L, 5L, 8L, 7L, 6L, 8L, 12L)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -12L))

Let's say I want to add new factor levels for each participant such that:

Engaged Living = Optimism + Zest + Gratitude

Emotional Competence = Emotion Regulation + Self-Control + Empathy

My current workflow is to convert the df from long to wide and then back to long (pivot_wider followed by pivot_longer). This gets the job done, but I'm wondering if there is another workflow that avoids the round trip (i.e., keeps the df in long format). I'm looking for a tidyverse/dplyr workflow.

It seems you want to add rows, not just factor levels. One way would be to create the new summary rows and then bind them back to the original data. For example:

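library(dplyr)

# `dd` is the data frame from the question
dd %>% 
  # relabel each component subscale with its composite name (all others become NA)
  mutate(subscales = case_when(
    subscales %in% c("Optimism", "Zest", "Gratitude") ~ "Engaged Living",
    subscales %in% c("Emotion Regulation", "Self-Control", "Empathy") ~ "Emotional Competence"
  )) %>% 
  # keep only the rows that belong to a composite
  filter(!is.na(subscales)) %>% 
  group_by(first_name, subscales) %>% 
  # one summed row per participant and composite
  summarize(value = sum(value)) %>% 
  # append the original rows below the new summary rows
  bind_rows(dd)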

which gives

# A tibble: 14 × 3
# Groups:   first_name [1]
   first_name subscales            value
   <chr>      <chr>                <int>
 1 P1         Emotional Competence    18
 2 P1         Engaged Living          36
 3 P1         Emotion Regulation       5
 4 P1         Empathy                  7
 5 P1         Family Support          10
 6 P1         Gratitude               12
 7 P1         Optimism                12
 8 P1         Peer Support             9
 9 P1         Persistence              5
10 P1         School Support           8
11 P1         Self-Awareness           7
12 P1         Self-Control             6
13 P1         Self-Efficacy            8
14 P1         Zest                    12
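
Note that in the output above `subscales` is a character column (`<chr>`) and the result is still grouped by `first_name`. If you want a factor column that includes the new levels, here is a minimal sketch of one variation. It assumes the question's data is stored in `dd`, and the names `composites`, `composite_rows` and `result` are hypothetical helpers introduced only for this illustration:

```r
library(dplyr)

# Rebuild the question's data; `dd` is the name the answer above assumes.
subscale_names <- c("Emotion Regulation", "Empathy", "Family Support",
                    "Gratitude", "Optimism", "Peer Support", "Persistence",
                    "School Support", "Self-Awareness", "Self-Control",
                    "Self-Efficacy", "Zest")
dd <- data.frame(
  first_name = "P1",
  subscales  = factor(subscale_names, levels = subscale_names),
  value      = c(5L, 7L, 10L, 12L, 12L, 9L, 5L, 8L, 7L, 6L, 8L, 12L)
)

# Hypothetical lookup: composite name -> the subscales that are summed into it
composites <- list(
  "Engaged Living"       = c("Optimism", "Zest", "Gratitude"),
  "Emotional Competence" = c("Emotion Regulation", "Self-Control", "Empathy")
)

# Build one summary row per participant for each composite
composite_rows <- bind_rows(lapply(names(composites), function(nm) {
  dd %>%
    filter(subscales %in% composites[[nm]]) %>%
    group_by(first_name) %>%
    summarize(value = sum(value), .groups = "drop") %>%
    mutate(subscales = nm)
}))

# Append the summary rows, then restore a factor that has the new levels last
result <- dd %>%
  mutate(subscales = as.character(subscales)) %>%
  bind_rows(composite_rows) %>%
  mutate(subscales = factor(subscales,
                            levels = c(levels(dd$subscales), names(composites))))
```

Keeping the composite definitions in a named list means that adding another composite only requires a new list entry. The data never leaves long format.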
