I have 88 tab-separated files that I need to import into R.
They are named like "Study-1-12".
The data in each one looks like this:
START: dd.mm.yyyy hh:mm:ss
WAITING 3780 ms REACTION 1230 ms
WAITING 9700 ms REACTION 377 ms
WAITING 5538 ms REACTION 310 ms
WAITING 4599 ms REACTION 361 ms
WAITING 9579 ms REACTION 338 ms
END: dd.mm.yyyy hh:mm:ss
So far I have imported all of them into a list and summarised each one, so the end result for each file is a table with two columns, "waiting" and "reaction", each holding a single mean value.
# Load the packages that provide read_tsv() and the dplyr verbs
library(readr)
library(dplyr)
# Load file paths and names ("pattern" is a regular expression, not a glob)
filepath <- list.files(path = "rawdata/", pattern = "\\.dat$", all.files = TRUE, full.names = TRUE) # full paths
filenames <- basename(filepath) # file names without the directory
# Load all files into a list, naming the columns
ldf <- lapply(filepath, function(x) read_tsv(file = x, skip = 1,
  col_names = c("waiting", "valueW", "ms", "ws", "reaction", "valueR", "ms1")))
names(ldf) <- filenames # name the list items after their files
# Keep only the relevant columns and compute the means
ldf <- lapply(ldf, function(x) x %>%
  select(waiting, valueW, reaction, valueR) %>%
  filter(waiting == "WAITING") %>%
  summarise(waiting = mean(valueW), reaction = mean(valueR))
)
Now I would like to create a data frame with additional columns derived from the file name (as above: study-1-12).
Is there any way of doing this in R?
library(purrr)
library(stringi)
fils <- list.files("~/Data/so", full.names=TRUE)
fils
## [1] "/Some/path/to/data/studyA-1-12" "/Some/path/to/data/studyB-30-31"
map_df(fils, function(x) {

  # Pull the metadata out of the file name: study name, subject id,
  # then the two trailing digits as experiment day and trial
  stri_match_all_regex(x, "([[:alnum:]]+)-([[:digit:]]+)-([[:digit:]])([[:digit:]])")[[1]] %>%
    as.list() %>%
    .[2:5] %>%
    set_names(c("study_name", "subject_id", "experiment_day", "trial")) -> meta

  # Keep only the WAITING/REACTION lines and extract the two numeric fields
  readLines(x) %>%
    grep("WAITING", ., value=TRUE) %>%
    map(~scan(text=., quiet=TRUE,
              what=list(character(), double(), character(),
                        character(), double(), character()))[c(2,5)]) %>%
    map_df(~set_names(as.list(.), c("waiting", "reaction"))) -> df

  # Attach the file-level metadata to every row
  df$study_name <- meta$study_name
  df$subject_id <- meta$subject_id
  df$experiment_day <- meta$experiment_day
  df$trial <- meta$trial

  df

})
## # A tibble: 10 × 6
##    waiting reaction study_name subject_id experiment_day trial
##      <dbl>    <dbl>      <chr>      <chr>          <chr> <chr>
## 1     3780     1230     studyA          1              1     2
## 2     9700      377     studyA          1              1     2
## 3     5538      310     studyA          1              1     2
## 4     4599      361     studyA          1              1     2
## 5     9579      338     studyA          1              1     2
## 6     3780     1230     studyB         30              3     1
## 7     9700      377     studyB         30              3     1
## 8     5538      310     studyB         30              3     1
## 9     4599      361     studyB         30              3     1
## 10    9579      338     studyB         30              3     1
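If only the file-name columns are needed, for example to attach them to per-file means that have already been computed, the parsing step can be done on its own in base R. The following is a minimal sketch, not a definitive implementation. The helper name parse_study_name is hypothetical, and it assumes every file name follows the pattern shown above (study name, subject id, then two digits read as experiment day and trial), optionally with an extension such as .dat.

```r
# Hypothetical helper: split file names such as "studyA-1-12" into metadata columns.
# Assumes every name matches <study>-<subject>-<day><trial>; non-matching names
# are silently dropped by the rbind() below, so check the row count if unsure.
parse_study_name <- function(path) {
  # Drop the directory and any extension: "rawdata/studyA-1-12.dat" -> "studyA-1-12"
  stem <- tools::file_path_sans_ext(basename(path))
  # Capture groups: study name, subject id, experiment day (one digit), trial (one digit)
  m <- regmatches(stem, regexec("^([[:alnum:]]+)-([0-9]+)-([0-9])([0-9])$", stem))
  parts <- do.call(rbind, m)  # one row per file: full match plus four groups
  data.frame(
    study_name     = parts[, 2],
    subject_id     = as.integer(parts[, 3]),
    experiment_day = as.integer(parts[, 4]),
    trial          = as.integer(parts[, 5]),
    stringsAsFactors = FALSE
  )
}

meta <- parse_study_name(c("rawdata/studyA-1-12.dat", "rawdata/studyB-30-31.dat"))
print(meta)
```

Because the result has one row per file in the same order as the input, it could be bound column-wise to a table of per-file summaries, for instance cbind(parse_study_name(names(ldf)), dplyr::bind_rows(ldf)) for a named list of one-row summaries like the one in the question.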