
Splitting a command line in Java

What's the recommended way to parse a shell-like command line in Java? By that I don't mean processing the options once they are already in array form (e.g. handling "-x" and such); there are loads of questions and answers about that already.

No, I mean splitting a full command string into "tokens". I need to convert a string such as:

user 123712378 suspend "They are \"bad guys\"" Or\ are\ they?

...to the list/array:

user
123712378
suspend
They are "bad guys"
Or are they?

I'm currently just splitting on whitespace, but that obviously can't handle the quotes and escaped spaces.

(Quote handling is most important. Escaped spaces would be nice-to-have.)

Note: My command string is the input from a shell-like web interface. It's not built from main(String[] args).

What you need is to implement a finite automaton. Read the string character by character and determine the next state from the current character and its neighbours (the next or previous character).
For example, a " indicates the start of a quoted string, but if it is preceded by a \ it leaves the current state unchanged, and you keep reading until the next character that takes you to another state.
I.e. essentially in your example you would have

read string -> read number   
      ^  -    -   -  |  

Of course, you would need to define all the states and the special characters that do or do not affect your state.
To be honest, I am not sure why you would want to provide such functionality to the end user.
Traditionally, CLI programs accept input in a standard format: -x, --x, --x=s, etc.
This format is well known to a typical user and is simple to implement and test for correctness.
Traditionally, if we are required to provide more "flexible" input for the user, it is best to build a GUI. That is what I would suggest.
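The finite-automaton idea described in this answer can be sketched roughly as follows. This is a minimal sketch, not a full shell parser: the class name CommandLineSplitter and the method split are hypothetical, only double quotes and backslash escapes are recognised (no single quotes), and a backslash is assumed to escape any following character, inside or outside quotes, which is simpler than what a real Bourne shell does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the state-machine approach; not from any library.
final class CommandLineSplitter {
    private enum State { NORMAL, IN_DOUBLE_QUOTES }

    static List<String> split(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        State state = State.NORMAL;
        boolean escaped = false;      // previous character was an unconsumed backslash
        boolean tokenStarted = false; // lets an empty "" still count as a token

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (escaped) {
                // Assumption: a backslash makes the next character literal, whatever it is.
                current.append(c);
                escaped = false;
            } else if (c == '\\') {
                escaped = true;
                tokenStarted = true;
            } else if (c == '"') {
                // An unescaped quote toggles between the two states.
                state = (state == State.NORMAL) ? State.IN_DOUBLE_QUOTES : State.NORMAL;
                tokenStarted = true;
            } else if (state == State.NORMAL && Character.isWhitespace(c)) {
                // Whitespace outside quotes ends the current token, if there is one.
                if (tokenStarted) {
                    tokens.add(current.toString());
                    current.setLength(0);
                    tokenStarted = false;
                }
            } else {
                current.append(c);
                tokenStarted = true;
            }
        }
        if (escaped) {
            throw new IllegalArgumentException("Trailing backslash");
        }
        if (state == State.IN_DOUBLE_QUOTES) {
            throw new IllegalArgumentException("Unterminated double quote");
        }
        if (tokenStarted) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The command string from the question, written as a Java literal.
        String line = "user 123712378 suspend \"They are \\\"bad guys\\\"\" Or\\ are\\ they?";
        for (String token : split(line)) {
            System.out.println(token);
        }
        // Prints the five tokens from the question, one per line:
        // user / 123712378 / suspend / They are "bad guys" / Or are they?
    }
}
```

Rejecting an unterminated quote or a trailing backslash with an exception is a design choice made here for illustration; a web interface might prefer to report the error back to the user instead.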

ArgumentTokenizer from DrJava parses a command line the way the Bourne shell and its derivatives do.

It properly supports escapes, so bash -c 'echo "\"escaped '\''single'\'' quote\""' gets tokenized into [bash, -c, echo "\"escaped 'single' quote\""].

Build the args[] back into a string, then tokenize using a regexp:

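import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        // Rebuild a single command line from the already-split arguments.
        String commandline = String.join(" ", args);
        System.out.println(commandline);

        // A token is either an unquoted run of non-whitespace or a double-quoted segment.
        List<String> list = new ArrayList<String>();
        Matcher m = Pattern.compile("([^\"]\\S*|\".+?\")\\s*").matcher(commandline);
        while (m.find()) {
            list.add(m.group(1)); // Add .replace("\"", "") to remove surrounding quotes.
        }

        System.out.println(list);
    }
}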

The latter part I took from here.
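Since the command string in the question does not come from main(String[] args), the same regexp can also be applied to the string directly. The following is a minimal sketch under that assumption; the class name RegexSplitter and the method tokenize are hypothetical. It handles plain double-quoted segments only: backslash-escaped quotes and escaped spaces, as in the question's example, are not handled by this pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper wrapping the regexp from the answer; not from any library.
final class RegexSplitter {
    // Same pattern as in the answer: an unquoted run, or a double-quoted segment.
    private static final Pattern TOKEN = Pattern.compile("([^\"]\\S*|\".+?\")\\s*");

    static List<String> tokenize(String commandLine) {
        List<String> tokens = new ArrayList<>();
        // Trim first so that leading whitespace cannot become part of the first token.
        Matcher m = TOKEN.matcher(commandLine.trim());
        while (m.find()) {
            // Strip the double quotes, as suggested in the answer's comment.
            tokens.add(m.group(1).replace("\"", ""));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("user 123712378 suspend \"bad guys\""));
        // prints [user, 123712378, suspend, bad guys]
    }
}
```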


 