Aim
I have hundreds of Excel Workbooks. There is a sheet in most of them, say, "Correct Sheet". Those sheets should all have 10 columns. I want to combine them all into one consolidated dataset (the Final Dataset).
However, some Workbooks do not have the Correct Sheet and when loading them, I get an error. Where I do have Correct Sheets, some of them do not have 10 columns. I want to exclude any Workbook that does not have a Correct Sheet or where the Correct Sheet does not have 10 columns.
Attempt
To mimic this, let's say that a correct sheet is a numeric variable. Of the two variables below (a = "hello" and b = 8), only "b" is correct, as it has a numeric value.
Reprex
a <- "hello"
b <- 8
#A: Check that a + 2 does not produce an error
if(tryCatch({a + 2}, error = function(e) {"error"}) != "error") {
  #A1:If a+2 does not produce an error, then check that the sum is right
  if(a+2 == 10) {
    print("a No error and produced correct answer - Add 'a' to consolidated list")
  } else {
    #B: Check that b + 2 does not produce an error
    if(tryCatch({b+2}, error = function(e) {"error"}) != "error") {
      #B1:If b+2 does not produce an error, then check that the sum is right
      if(b+2 == 10) {
        print("a No error but produced wrong answer,
b No error and produced correct answer,
Add 'b' to consolidated list")
      } else {
        print("a No error but produced wrong answer, b No error but produced wrong answer")
      }
    #B2: If b+2 produced an error
    } else {print("a No error but produced wrong answer, b Error")}
  }
#A2:If a+2 produces an error, then go straight to b
} else {
  #B: Check that b + 2 does not produce an error
  if(tryCatch({b+2}, error = function(e) {"error"}) != "error") {
    #B1:If b+2 does not produce an error, then check that the sum is right
    if(b+2 == 10) {
      print("a Error,
b No error and produced correct answer,
Add 'b' to consolidated list")
    } else {
      print("a Error, b No error but produced wrong answer")
    }
  #B2: If b+2 produced an error
  } else {print("a Error, b Error")}
}
Problem
I actually have three variables (e.g. a = "hello", b = 8, and c = "goodbye"), which adds to the complexity above.
Is there a simpler way of doing this?
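One way to avoid the nesting, as a minimal base R sketch: put the candidates in a named list and run the same check on each one, so a third (or hundredth) variable needs no extra branches. The names `candidates`, `check_one`, and `consolidated` and the status labels are illustrative only.

```r
# Candidates from the question; only b is numeric and satisfies x + 2 == 10.
candidates <- list(a = "hello", b = 8, c = "goodbye")

# Hypothetical helper: classify one value as "correct", "wrong answer" or "error".
check_one <- function(x) {
  tryCatch(
    if (x + 2 == 10) "correct" else "wrong answer",
    error = function(e) "error"
  )
}

# Apply the same check to every candidate; the result is a named character vector.
status <- vapply(candidates, check_one, character(1))

# Keep only the candidates that passed.
consolidated <- candidates[status == "correct"]
```

Here `status` records why each candidate was kept or dropped, and `consolidated` holds only the ones that passed.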
You could try purrr::safely(). Here's an analogous example using a nested list of “Excel workbooks”:
library(purrr)
# example "workbooks"
wkbks <- list(
  w1 = list(
    sheet1 = 1:3,
    `Correct Sheet` = 1:10
  ),
  w2 = list(
    sheet1 = 1:3
  ),
  w3 = list(
    `Correct Sheet` = 1:8
  ),
  w4 = list(
    `Correct Sheet` = 1:10,
    sheet2 = 1:5
  )
)
# helper function to read a wkbk and throw an error if the conditions are not met
read_correct <- function(x) {
  stopifnot(length(x[["Correct Sheet"]]) == 10)
  x[["Correct Sheet"]]
}
# iterate over wkbks, wrapping `read_correct` in `safely()`
correct_sheets <- wkbks |>
  map(safely(read_correct)) |>
  transpose()
Results:
# see all errors, removing NULLs
#> compact(correct_sheets$error)
$w2
<simpleError in .f(...): length(x[["Correct Sheet"]]) == 10 is not TRUE>
$w3
<simpleError in .f(...): length(x[["Correct Sheet"]]) == 10 is not TRUE>
# see all correct results, removing NULLs
#> compact(correct_sheets$result)
$w1
[1] 1 2 3 4 5 6 7 8 9 10
$w4
[1] 1 2 3 4 5 6 7 8 9 10
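For real files, the same pattern might look roughly like the sketch below. It assumes the workbooks are .xlsx files read with readxl::read_excel(), which raises an error when the requested sheet is missing, and that the valid sheets share column names so they can be row-bound. The helper names `read_correct_sheet` and `consolidate_workbooks` and the folder path in the usage comment are placeholders, not anything from the question.

```r
library(purrr)

# Hypothetical helper: read "Correct Sheet" from one workbook and fail if the
# sheet is missing (read_excel() errors) or does not have exactly 10 columns.
read_correct_sheet <- function(path) {
  sheet <- readxl::read_excel(path, sheet = "Correct Sheet")
  stopifnot(ncol(sheet) == 10)
  sheet
}

# Hypothetical wrapper: try every path, keep the successes, list the failures.
consolidate_workbooks <- function(paths) {
  attempts <- map(set_names(paths), safely(read_correct_sheet))
  good <- compact(map(attempts, "result"))
  bad <- compact(map(attempts, "error"))
  list(
    # Assumption: all valid sheets have identical column names.
    final_dataset = do.call(rbind, unname(good)),
    skipped = names(bad)
  )
}

# Usage (the folder name is a placeholder):
# paths <- list.files("path/to/workbooks", pattern = "\\.xlsx$", full.names = TRUE)
# out <- consolidate_workbooks(paths)
```

Returning the skipped paths alongside the combined data makes it easy to check afterwards which workbooks were excluded.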