Splitting a command line in Java
What's the recommended way to parse a shell-like command line in Java? By that I don't mean processing the options when they are already in array form (e.g. handling "-x" and such); there are loads of questions and answers about that already.
No, I mean splitting a full command string into "tokens". I need to convert a string such as:
user 123712378 suspend "They are \"bad guys\"" Or\ are\ they?
...to the list/array:
user
123712378
suspend
They are "bad guys"
Or are they?
I'm currently just doing a split on whitespace, but that obviously can't handle the quotes and escaped spaces.
(Quote handling is most important. Escaped spaces would be nice-to-have.)
Note: My command string is the input from a shell-like web interface. It's not built from main(String[] args).
What you need is to implement a finite automaton. You read the string character by character and find the next state depending on the next or previous character.
For example, a " indicates the start of a string, but if it is preceded by a \ it leaves the current state unchanged, and reading continues until the next token that takes you to the next state.
I.e., essentially in your example you would have:
read string -> read number
^ - - - |
You would of course need to define all the states and the special characters that do or do not affect your state.
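The automaton described above can be sketched roughly as follows. This is only a minimal illustration, not a full shell grammar: the class name CommandLineSplitter, its two states, and the rule that a backslash escapes any following character (inside or outside double quotes) are assumptions made for this example. Single quotes are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a two-state automaton that splits a shell-like line.
final class CommandLineSplitter {
    private enum State { NORMAL, IN_QUOTES }

    static List<String> split(String line) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        State state = State.NORMAL;
        boolean escaped = false;      // the previous character was a backslash
        boolean tokenStarted = false; // lets "" produce an empty token

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (escaped) {
                // Assumption: a backslash makes the next character literal.
                current.append(c);
                escaped = false;
                tokenStarted = true;
            } else if (c == '\\') {
                escaped = true;
            } else if (c == '"') {
                // An unescaped quote toggles the state and is not part of the token.
                state = (state == State.NORMAL) ? State.IN_QUOTES : State.NORMAL;
                tokenStarted = true;
            } else if (state == State.NORMAL && Character.isWhitespace(c)) {
                // Unquoted whitespace ends the current token, if there is one.
                if (tokenStarted) {
                    tokens.add(current.toString());
                    current.setLength(0);
                    tokenStarted = false;
                }
            } else {
                current.append(c);
                tokenStarted = true;
            }
        }
        if (escaped) {
            throw new IllegalArgumentException("Dangling backslash at end of input");
        }
        if (state == State.IN_QUOTES) {
            throw new IllegalArgumentException("Unterminated double quote");
        }
        if (tokenStarted) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "user 123712378 suspend \"They are \\\"bad guys\\\"\" Or\\ are\\ they?";
        for (String token : split(input)) {
            System.out.println(token);
        }
    }
}
```

Run on the string from the question, this yields the five tokens listed there. Rejecting unterminated quotes with an exception is a design choice of the sketch; a web interface might prefer to report the error to the user instead.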
To be honest, I am not sure why you would want to provide such functionality to the end user. Traditionally, all CLI programs accept input in a standard format: -x or --x or --x=s etc. This format is well known to a typical user and is simple to implement and test for correctness.
Traditionally, if we are required to provide more "flexible" input for the user, it is best to build a GUI. That is what I would suggest.
ArgumentTokenizer from DrJava parses a command line the way the Bourne shell and its derivatives do.
It properly supports escapes, so bash -c 'echo "\"escaped '\''single'\'' quote\""' gets tokenized into [bash, -c, echo "\"escaped 'single' quote\""].
Build the args[] back into a string, then tokenize using a regexp:
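import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        // Rebuild the command line from the already-split arguments.
        String commandline = String.join(" ", args);
        System.out.println(commandline);
        List<String> list = new ArrayList<String>();
        // A token is either a run of non-whitespace or a "quoted string".
        Matcher m = Pattern.compile("([^\"]\\S*|\".+?\")\\s*").matcher(commandline);
        while (m.find())
            list.add(m.group(1)); // Add .replace("\"", "") to remove surrounding quotes.
        System.out.println(list);
    }
}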
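Since the question's input is a single string rather than args[], and contains escaped quotes and spaces that the pattern above does not account for, a regexp-based variant might look like the sketch below. The class name RegexSplitter and the pattern are assumptions made for illustration, not part of the answer above. It treats a backslash as escaping any following character, does not join adjacent quoted and unquoted pieces into one token, and silently skips an unterminated quote.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: regexp-based splitting that understands \" and "\ ".
final class RegexSplitter {
    // Group 1: contents of a double-quoted token, where \x counts as one unit.
    // Group 2: an unquoted token, where "\ " keeps the space inside the token.
    private static final Pattern TOKEN = Pattern.compile(
            "\"((?:\\\\.|[^\"\\\\])*)\"|((?:\\\\.|[^\\s\"\\\\])+)");

    static List<String> split(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {
            String raw = (m.group(1) != null) ? m.group(1) : m.group(2);
            // Drop each escaping backslash and keep the character after it.
            tokens.add(raw.replaceAll("\\\\(.)", "$1"));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "user 123712378 suspend \"They are \\\"bad guys\\\"\" Or\\ are\\ they?";
        for (String token : split(input)) {
            System.out.println(token);
        }
    }
}
```

On the string from the question this gives the same five tokens as listed there, with the surrounding quotes already removed.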