
Check whether a string could be a valid filesystem path on both Linux and Windows

I'm writing a bash script that reads a string entered by the user. The string must be a valid path (a chain of parent directories ending in the target file or directory) that can be accessed without problems on both Linux and Windows.

This is about directory names and file names: they need to satisfy both Linux and Windows requirements. I know that on Linux, bash 4.x (the version probably doesn't matter) lets me create a file whose name contains almost any character, but I may then have problems accessing that file.

So far, I know that:

  • unlike Linux, Windows can't access a file that has a colon in its name
  • unlike Windows, Linux can have problems accessing a file that has an exclamation mark in its name
  • Windows doesn't allow a name that consists only of spaces
  • neither Linux nor Windows allows "." or ".." as a name
  • Windows doesn't allow a name that consists only of dots

etc.

Is there a POSIX standard, or some other set of rules, that satisfies both Linux and Windows requirements? Which characters are allowed on both, and what are the exceptions for each?

I'm also having trouble checking whether a string is a path that fits. I assumed I can use alphanumeric characters, underscores, hyphens, parentheses, tildes, spaces, and dots. I also assume that the path should start with a slash and should not end with one.

I tried regexes like these, but they don't work the way I want them to:

[[ ! "$path" == *['!'@#\$%^\&*+]* ]]
[[ "$path" == [a-zA-z0-9_.\ \(\)~\/-]* ]]
[[ "$path" =~ ^[a-zA-z0-9_\ -]+$ ]]

I just don't get all the peculiarities of bash regex.

So, what are the requirements, and what is the best way to verify them?

I would write a whitelist script that accepts the lowest common denominator of path names for Windows and Unix environments, though I guess one has to distinguish between the Windows and Unix worlds when it comes to path prefixes and delimiters.

The following script might be useful as a starting point. Pass a path to the script as its first parameter: it prints OK (exit status 0) when the path satisfies the regular expression, and NOK (exit status 1) when it does not.

For regular expression matching, the script uses egrep (equivalent to grep -E). The -x option means that the pattern has to match the whole line rather than just a part of it. $? holds the exit status of egrep: if it is zero, the path matched the regular expression.
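To make the effect of -x and of the exit status concrete, here is a minimal sketch. The helper name matches_whole is hypothetical and only serves the illustration; it uses grep -E, which behaves like egrep, and adds -q so that only the exit status is reported.

```bash
#!/bin/bash

# matches_whole STRING PATTERN  (hypothetical helper, for illustration only)
# Exit status 0 if PATTERN matches the whole of STRING, 1 otherwise.
matches_whole() {
    # -E: extended regex, -x: the whole line must match, -q: no output.
    printf '%s\n' "$1" | grep -E -q -x -- "$2"
}

matches_whole 'abc'  '[a-z]+'; echo "abc  -> $?"   # prints "abc  -> 0"
matches_whole 'abc!' '[a-z]+'; echo "abc! -> $?"   # prints "abc! -> 1"
```

Without -x the second call would succeed as well, because the pattern matches the "abc" part of the string.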

Best, Julian

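#!/bin/bash

# Path delimiter; "/" is used here for both platforms.
DELIM="/"
# A single name: one ASCII letter followed by any number of letters or digits.
FILE="[a-zA-Z]([a-zA-Z0-9])*"
# Optional leading delimiter, one or more names, optional trailing delimiter.
R="(${DELIM})?${FILE}(${DELIM}${FILE})*${DELIM}?"

path="$1"

# printf is used instead of echo so that input such as "-n" is passed through
# unchanged. grep -E is the non-deprecated spelling of egrep; -x requires the
# pattern to match the whole line.
printf '%s\n' "$path" | grep -E -x -- "$R"

[ $? -eq 0 ] && {
    echo "OK"
    exit 0
}

echo "NOK"
exit 1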
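The same whitelist idea can also be sketched without grep, using bash's own [[ =~ ]] operator and checking each path component separately. This is only a sketch under assumptions: the function names is_portable_name and is_portable_path are hypothetical, the allowed character set is the one proposed in the question, and only the rules listed in the question are checked. Other Windows restrictions, such as reserved device names like CON or NUL and names ending in a dot or a space, are not covered.

```bash
#!/bin/bash

# Hypothetical helper: check one path component against the question's rules.
is_portable_name() {
    # Force the C locale so that ranges like A-Z mean ASCII letters only.
    local LC_ALL=C
    local name=$1
    # Patterns live in variables: quoting a pattern inside [[ =~ ]] would make
    # bash treat it as a literal string instead of a regular expression.
    # Allowed: letters, digits, underscore, dot, space, parentheses, tilde, hyphen.
    local allowed='^[A-Za-z0-9_. ()~-]+$'
    local only_dots_or_spaces='^[. ]+$'

    [[ $name =~ $allowed ]] || return 1
    # Rejects ".", "..", and names made only of dots and/or spaces.
    [[ $name =~ $only_dots_or_spaces ]] && return 1
    return 0
}

# Hypothetical helper: check a whole path, component by component.
is_portable_path() {
    local path=$1 part
    local -a parts
    # As assumed in the question: leading slash, no trailing slash.
    # A newline is rejected up front because read only sees the first line.
    [[ $path == /* && $path != */ && $path != *$'\n'* ]] || return 1
    # Split on "/" after dropping the leading slash; "//" yields an empty part.
    IFS=/ read -r -a parts <<< "${path#/}"
    for part in "${parts[@]}"; do
        is_portable_name "$part" || return 1
    done
    return 0
}

is_portable_path '/home/user/my file (1).txt' && echo "OK"    # prints OK
is_portable_path '/home/user/a:b'             || echo "NOK"   # prints NOK
```

Checking component by component keeps each rule readable, and a stricter or looser character set only requires changing the allowed pattern.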
