I need your kind help tidying data using R.
My original data looks like this:
> dput(mydata)
structure(list(subject = structure(c(1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L), .Label = c("N1", "E1"), class = "factor"), item_number = c(1,
2, 1, 7, 1, 2, 2, 10), block = c(1, 1, 3, 3, 1, 1, 3, 3), condition = c("L",
"L", "EI", "I", "L", "L", "EI", "I")), row.names = c(NA, 8L), class = "data.frame")
> mydata
  subject item_number block condition
1      N1           1     1         L
2      N1           2     1         L
3      N1           1     3        EI
4      N1           7     3         I
5      E1           1     1         L
6      E1           2     1         L
7      E1           2     3        EI
8      E1          10     3         I
Because of a programming error, the conditions in block 1 were not labelled correctly. I am trying to fix that by relabelling condition in block 1, per subject and per item number. Ideally, any item_number in block 1 whose condition is L should take the condition label given to the same item_number in block 3. For example, for subject N1, item_number 1 exists in block 3 with the label EI, so the condition for item_number 1 in block 1 should also be set to 'EI'. Item_number 2 does not exist in block 3 for subject N1, so the condition for item_number 2 in block 1 should be 'E'.
The desired output should look like this:
> dput(mydata_cleaned)
structure(list(subject = structure(c(1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L), .Label = c("N1", "E1"), class = "factor"), item_number = c(1,
2, 1, 7, 1, 2, 2, 10), block = c(1, 1, 3, 3, 1, 1, 3, 3), condition = c("EI",
"E", "EI", "I", "E", "EI", "EI", "I")), row.names = c(NA, 8L), class = "data.frame")
> mydata_cleaned
  subject item_number block condition
1      N1           1     1        EI
2      N1           2     1         E
3      N1           1     3        EI
4      N1           7     3         I
5      E1           1     1         E
6      E1           2     1        EI
7      E1           2     3        EI
8      E1          10     3         I
Any help is greatly appreciated.
One option is to reshape the data to 'wide' format, with column names taken from 'block', then replace the values in column `1` based on the values in column `3`, and finally reshape back to 'long' format.
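library(dplyr)
library(tidyr)
mydata %>%
  # one column per block: `1` and `3` hold the condition labels
  pivot_wider(names_from = block, values_from = condition) %>%
  # copy "EI" from block 3 over an "L" in block 1; use 'E' when block 3 has no entry
  mutate(`1` = case_when(`3` %in% "EI" & `1` %in% "L" ~ `3`,
                         is.na(`3`) ~ 'E', TRUE ~ `1`)) %>%
  # back to long format, dropping item/block combinations that never existed
  pivot_longer(cols = c(`1`, `3`), names_to = 'block',
               values_to = 'condition', values_drop_na = TRUE)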
Output:
# A tibble: 8 x 4
#  subject item_number block condition
#  <fct>         <dbl> <chr> <chr>
#1 N1                1 1     EI
#2 N1                1 3     EI
#3 N1                2 1     E
#4 N1                7 3     I
#5 E1                1 1     E
#6 E1                2 1     EI
#7 E1                2 3     EI
#8 E1               10 3     I
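Note that the case_when() above only copies the label "EI", and that 'block' comes back as a character column after pivot_longer(). If block 3 could also carry other labels for an item that appears in block 1, a join-based variant may be easier to adapt. The following is only a sketch under some assumptions: the names block3, condition_b3 and mydata_fixed are made up for illustration, and each subject/item_number pair is assumed to occur at most once in block 3 (otherwise the join would duplicate rows).

```r
library(dplyr)

# Example data from the question
mydata <- data.frame(
  subject = factor(c("N1", "N1", "N1", "N1", "E1", "E1", "E1", "E1"),
                   levels = c("N1", "E1")),
  item_number = c(1, 2, 1, 7, 1, 2, 2, 10),
  block = c(1, 1, 3, 3, 1, 1, 3, 3),
  condition = c("L", "L", "EI", "I", "L", "L", "EI", "I")
)

# Lookup table (hypothetical name): the block 3 label per subject/item pair
block3 <- mydata %>%
  filter(block == 3) %>%
  select(subject, item_number, condition_b3 = condition)

mydata_fixed <- mydata %>%
  # attach the block 3 label, NA when the item never occurs in block 3
  left_join(block3, by = c("subject", "item_number")) %>%
  mutate(condition = case_when(
    # block 1 "L" with a block 3 counterpart: take over that label
    block == 1 & condition == "L" & !is.na(condition_b3) ~ condition_b3,
    # block 1 "L" without a block 3 counterpart: label as "E"
    block == 1 & condition == "L" ~ "E",
    # everything else stays as it is
    TRUE ~ condition
  )) %>%
  select(-condition_b3)

mydata_fixed
```

Because nothing is reshaped, the original row order and the numeric type of 'block' are kept, so the result can be compared directly against mydata_cleaned from the question.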