Check if a string could be a path in the file system on Linux and Windows

I'm trying to write a bash script that reads a string entered by the user. The string must be a valid path (a chain of parent directories plus the target file or directory) that can be accessed on both Linux and Windows.

This is about the names of directories and files: they need to satisfy both Linux and Windows requirements. I know that bash 4.x on Linux (the version probably doesn't matter) lets me create a file whose name contains almost any characters, but I may then have problems accessing such a file.

So far, I know that:

  • unlike Linux, Windows can't access a file that has a colon in its name
  • unlike Windows, Linux can have problems accessing a file that has an exclamation mark in its name
  • Windows doesn't allow a name that contains only spaces
  • neither Linux nor Windows allows "." or ".." as a name
  • Windows doesn't allow a name that contains only dots

etc.

Is there a POSIX standard, or some other set of rules, that satisfies both Linux and Windows requirements? Which characters are allowed on both, and what are the exceptions on each?

Also, I'm having trouble checking whether a string is a path that fits. I assume I can allow alphanumeric characters, underscores, hyphens, parentheses, tildes, spaces and dots. I also assume that the path should start with a slash and should not end with one.

I tried regexes like these, and they don't work the way I want:

[[ ! "$path" == *['!'@#\$%^\&*+]* ]]
[[ "$path" == [a-zA-z0-9_.\ \(\)~\/-]* ]]
[[ "$path" =~ ^[a-zA-z0-9_\ -]+$ ]]

I just don't understand all the peculiarities of bash regexes.

So, what are the requirements, and what is a better way to verify them?

I would write a whitelist script that accepts the lowest common denominator of path names for Windows and Unix environments, though I guess one has to distinguish between the Windows and Unix worlds when it comes to file prefixes and delimiters.
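To illustrate the prefix and delimiter difference, here is a rough sketch that strips an optional Windows drive prefix such as `C:` and splits the rest on either `/` or `\`. The helper name `split_components` is hypothetical and is not part of the script in this answer; UNC prefixes such as `\\server\share` are deliberately not handled.

```shell
#!/bin/bash

# Hypothetical helper: print the components of a path, one per line.
split_components() {
    local path=$1
    # Drop a leading drive prefix such as "C:" (Windows only).
    case $path in
        [A-Za-z]:*) path=${path#??} ;;
    esac
    # Turn both delimiters into newlines, then drop empty lines
    # (they come from a leading, trailing or doubled delimiter).
    printf '%s\n' "$path" | tr '\\/' '\n\n' | sed '/^$/d'
}
```

For example, `split_components 'C:\Users\me\notes.txt'` prints `Users`, `me` and `notes.txt` on separate lines, and `/home/me/notes.txt` yields `home`, `me` and `notes.txt` in the same way.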

The script below might be useful as a starting point. Pass a path to the script as its first parameter: it prints OK (and exits with status 0) when the path satisfies the regular expression, and prints NOK (and exits with status 1) when it does not.

For regular expression matching, I've used egrep (equivalent to grep -E) in the script; the -x option means that the pattern has to match the whole line. $? holds the exit status of egrep: if it is zero, the path matched the regular expression.

Best, Julian

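#!/bin/bash

DELIM="/"
# A name starts with a letter, followed by any number of letters or digits.
FILE="[a-zA-Z][a-zA-Z0-9]*"
# Optional leading delimiter, one or more names, optional trailing delimiter.
R="(${DELIM})?${FILE}(${DELIM}${FILE})*${DELIM}?"

path=$1

# grep -E is the non-deprecated spelling of egrep.
# -x: the whole line must match; -q: print nothing, only set the exit status.
# printf is used instead of echo so that input such as "-n" is passed through.
printf '%s\n' "$path" | grep -Eqx "$R"

[ $? -eq 0 ] && {
    echo "OK"
    exit 0
}

echo "NOK"
exit 1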
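
One possible extension, sketched under assumptions rather than offered as a complete rule set: the function below widens the whitelist to the characters proposed in the question (alphanumerics, underscore, hyphen, parentheses, tilde, space, dot), requires a leading slash and no trailing slash, and also rejects names that Windows does not accept. Those are names ending in a dot or a space (which covers names made only of dots or spaces, including "." and "..") and reserved device names such as CON or NUL. The function name `portable_path_ok` is hypothetical, and the Windows rules come from Microsoft's file naming conventions, not from this thread. Colons and exclamation marks are rejected simply because they are not on the whitelist.

```shell
#!/bin/bash

# Hypothetical helper: exit status 0 if the path looks usable on both
# Linux and Windows, 1 otherwise. Assumes the character ranges behave
# as in the C locale (ASCII letters and digits only).
portable_path_ok() {
    local p=$1
    case $p in
        */ | *//*) return 1 ;;   # trailing slash or empty component
        /*) ;;                   # absolute path: keep checking
        *) return 1 ;;           # empty or relative path
    esac
    case $p in
        # Any character outside the whitelist (this also rejects newlines).
        *[!A-Za-z0-9_.\ \(\)~/-]*) return 1 ;;
        # A component that ends in a dot or a space.
        *[.\ ] | *[.\ ]/*) return 1 ;;
    esac
    # Reserved Windows device names, with or without an extension,
    # compared case-insensitively, one component per line.
    printf '%s\n' "$p" | tr '/' '\n' |
        grep -Eiqx '(CON|PRN|AUX|NUL|COM[1-9]|LPT[1-9])(\..*)?' && return 1
    return 0
}
```

It can be used like the script above, for example `portable_path_ok "$1" && echo OK || echo NOK`. Keeping the whitelist in a shell pattern and only the reserved names in a regular expression avoids most of the quoting pitfalls of `[[ ... =~ ... ]]`.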
