Wrangling data using R

Question

I need help tidying data in R.

My original data looks like this:

> dput(mydata)
structure(list(subject = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L), .Label = c("N1", "E1"), class = "factor"), item_number = c(1, 
2, 1, 7, 1, 2, 2, 10), block = c(1, 1, 3, 3, 1, 1, 3, 3), condition = c("L", 
"L", "EI", "I", "L", "L", "EI", "I")), row.names = c(NA, 8L), class = "data.frame")


> mydata
  subject item_number block condition
1      N1           1     1         L
2      N1           2     1         L
3      N1           1     3        EI
4      N1           7     3         I
5      E1           1     1         L
6      E1           2     1         L
7      E1           2     3        EI
8      E1          10     3         I

Because of a programming error, the conditions in block 1 were not labelled correctly, so I am trying to fix them per subject and per item number. Any item_number in block 1 whose condition is L should be relabelled with the condition given to the same item_number in block 3 for the same subject. For example, for subject N1, item_number 1 exists in block 3 with the label EI, so the condition for item_number 1 in block 1 should also be 'EI'. Item_number 2 does not exist in block 3 for subject N1, so the condition for item_number 2 in block 1 should be 'E'.

The desired output should look like this:

> dput(mydata_cleaned)
structure(list(subject = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L), .Label = c("N1", "E1"), class = "factor"), item_number = c(1, 
2, 1, 7, 1, 2, 2, 10), block = c(1, 1, 3, 3, 1, 1, 3, 3), condition = c("EI", 
"E", "EI", "I", "E", "EI", "EI", "I")), row.names = c(NA, 8L), class = "data.frame")

> mydata_cleaned
  subject item_number block condition
1      N1           1     1        EI
2      N1           2     1         E
3      N1           1     3        EI
4      N1           7     3         I
5      E1           1     1         E
6      E1           2     1        EI
7      E1           2     3        EI
8      E1          10     3         I

Any help is greatly appreciated.

Answer 1

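One option is to reshape to wide format, with column names created from 'block', replace the values in column `1` based on the values in column `3`, and then reshape back to long format. Note that 'block' comes back as a character column, and the rows come back grouped by subject and item_number rather than in the original order.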

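library(dplyr)
library(tidyr)

mydata %>%
  # one row per subject/item, with the block 1 and block 3 labels side by side
  pivot_wider(names_from = block, values_from = condition) %>%
  mutate(`1` = case_when(
    # block 1 is "L" and the item also appears in block 3: copy the block 3 label
    `1` %in% "L" & !is.na(`3`) ~ `3`,
    # block 1 is "L" but the item has no block 3 row: use "E"
    `1` %in% "L" ~ "E",
    TRUE ~ `1`)) %>%
  # back to long format; drop the NA cells created by the reshape
  pivot_longer(cols = c(`1`, `3`), names_to = 'block',
               values_to = 'condition', values_drop_na = TRUE)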

Output:

# A tibble: 8 x 4
#  subject item_number block condition
#  <fct>         <dbl> <chr> <chr>    
#1 N1                1 1     EI       
#2 N1                1 3     EI       
#3 N1                2 1     E        
#4 N1                7 3     I        
#5 E1                1 1     E        
#6 E1                2 1     EI       
#7 E1                2 3     EI       
#8 E1               10 3     I
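
If keeping the original row order and the numeric 'block' column matters, a join-based variant might be worth a try. This is only a sketch, not part of the accepted answer: the names `block3`, `condition3`, and `fixed` are made up for illustration, and it assumes each subject/item_number pair appears at most once in block 3 (otherwise the join would duplicate rows).

```r
library(dplyr)

# Same data as in the question
mydata <- data.frame(
  subject     = factor(c("N1", "N1", "N1", "N1", "E1", "E1", "E1", "E1"),
                       levels = c("N1", "E1")),
  item_number = c(1, 2, 1, 7, 1, 2, 2, 10),
  block       = c(1, 1, 3, 3, 1, 1, 3, 3),
  condition   = c("L", "L", "EI", "I", "L", "L", "EI", "I")
)

# Lookup table (hypothetical name): the label each subject/item got in block 3
block3 <- mydata %>%
  filter(block == 3) %>%
  select(subject, item_number, condition3 = condition)

fixed <- mydata %>%
  # attach the block 3 label (NA when the item has no block 3 row)
  left_join(block3, by = c("subject", "item_number")) %>%
  mutate(condition = case_when(
    block == 1 & condition == "L" & !is.na(condition3) ~ condition3,
    block == 1 & condition == "L"                      ~ "E",
    TRUE                                               ~ condition
  )) %>%
  select(-condition3)

fixed
```

Because `left_join()` keeps the rows of `mydata` in place, `fixed$condition` can be compared directly against `mydata_cleaned$condition` from the question.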

1 answer

solution1
1 ACCEPTED 2020-12-29 16:36:57