
How to precisely check for illegal characters of file paths?

I am developing a Shiny app, which generates folders and subfolders for each user and each of their experiments.

I wish to ensure that neither the user nor experiment names contain any illegal characters.

I currently define a character vector containing every illegal character I know of, but that approach is prone to human error: I could easily miss one. Is there a more precise way of doing this?

dir <- "~/home/my_app_data"
usr <- "john"
exp <- "explosion`s"
path <- paste(dir, usr, exp, sep = "/")
illegal <- c(" ", ",", "`")

# Split the path into single characters and check each against the blacklist
if (any(illegal %in% unlist(strsplit(x = path, split = "")))) {
  stop("Illegal characters used")
} else {
  dir.create(path, recursive = TRUE)
}

Use grepl with pattern = "\\W", which matches any non-word character, that is, anything other than letters, digits, and the underscore "_".

FUN <- function(x) {
  if (grepl("\\W", x)) stop(sprintf("Illegal characters used in '%s'", x)) else x
}

FUN(usr)
# [1] "john"

FUN(exp)
# Error in FUN(exp) : Illegal characters used in 'explosion`s'

lapply(c(usr, exp), FUN)
# Error in FUN(X[[i]], ...) : Illegal characters used in 'explosion`s' 

FUN("john123")
# [1] "john123"

FUN("john_123")
# [1] "john_123"

(Of course, you will want to define your own else branch.)
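The same whitelist idea can be applied to several names at once. Below is a minimal sketch, not part of the original answer: the helper names is_safe_name and validate_names are hypothetical, and the explicit ASCII class [A-Za-z0-9_] is an assumption chosen because what "\\W" treats as a word character can depend on the locale.

```r
# Hypothetical helper: TRUE for non-empty names made only of ASCII letters,
# digits, and underscores. The allowed set is an assumption; adjust as needed.
is_safe_name <- function(x) {
  !is.na(x) & nzchar(x) & !grepl("[^A-Za-z0-9_]", x)
}

# Hypothetical wrapper: stops with a message listing every offending name,
# otherwise returns the input invisibly.
validate_names <- function(x) {
  bad <- x[!is_safe_name(x)]
  if (length(bad) > 0) {
    stop("Illegal characters used in: ",
         paste(sprintf("'%s'", bad), collapse = ", "))
  }
  invisible(x)
}

is_safe_name(c("john", "explosion`s", "john_123", ""))
# [1]  TRUE FALSE  TRUE FALSE

validate_names(c("john", "john_123"))  # passes silently
```

Reporting all bad names in one error, rather than stopping at the first, may be friendlier in a Shiny form where the user and experiment names are submitted together.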

dir <- "~/home/my_app_data"
usr <- "john"
exp <- "explosion`s"
path <- paste(dir, usr, exp, sep = "/")

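Replacing every run of non-alphanumeric characters with "_" and comparing the result with the original name should catch most problems: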

if(!(identical(gsub(
  "[^[:alnum:]]+",
  "_",
  iconv(exp, from = "ascii", "utf-8")
), exp))) {
  stop("Illegal characters used")
} else {
  dir.create(path, recursive = TRUE)
}
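To see what that comparison accepts and rejects, it can help to look at the substitution on its own. The sketch below is only an illustration: normalise is a hypothetical helper name, and it leaves out the iconv step for brevity.

```r
# Hypothetical helper: collapse each run of non-alphanumeric characters
# into a single underscore, as the gsub call above does.
normalise <- function(x) gsub("[^[:alnum:]]+", "_", x)

normalise("explosion`s")  # "explosion_s" -> differs from input, so rejected
normalise("john_123")     # "john_123"    -> unchanged, so accepted
normalise("my exp, v2")   # "my_exp_v2"   -> differs from input, so rejected
normalise("a__b")         # "a_b"         -> differs from input, so rejected
```

Note that a single underscore passes because it is replaced by itself, while consecutive underscores do not. If rejecting the name is too strict for your app, one option is to create the folder under normalise(exp) instead of stopping, at the cost of two different names possibly mapping to the same folder.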
