Splitting string expression into tokens

My input is like this:

String str = "-1.33E+4-helloeeee+4+(5*2(10/2)5*10)/2";

I want the output to be:

1.33E+4
helloeeee
4
5
2
10
2
5
10
2

But I am getting this output:

1.33, 4, helloeeee, 4, 5, 2, 10, 2, 5, 10, 2

I want the number with its exponent kept whole after splitting, i.e. "1.33E+4" as a single token.

Here is my code:

    String str = "-1.33E+4-helloeeee+4+(5*2(10/2)5*10)/2";
    List<String> tokensOfExpression = new ArrayList<String>();
    String[] tokens=str.split("[(?!E)+*\\-/()]+");
    for(String token:tokens)
    {   
         System.out.println(token);
         tokensOfExpression.add(token);
    }
    if(tokensOfExpression.get(0).equals(""))
    {
         tokensOfExpression.remove(0);
    }

You can't do that with a single regular expression, because of the ambiguities introduced by floating-point constants in scientific notation, and in any case you need to know which token is which without having to re-scan them. You've also misstated your requirement, as you certainly need the binary operators in the output as well. You need to write both a scanner and a parser. Look up 'recursive descent expression parser' and 'Dijkstra shunting-yard algorithm'.
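As a rough illustration of the scanner half of that advice, here is a minimal hand-written sketch. The names `ExpressionScanner`, `Token` and `Kind` are hypothetical and not from the question; the grammar it accepts (digits, optional fraction, optional `E`/`e` exponent with optional sign, letter-led identifiers, the four operators and parentheses) is an assumption based on the sample input. It only produces typed tokens; the parser is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scanner sketch: class, enum and field names are illustrative only.
class ExpressionScanner {
    enum Kind { NUMBER, IDENTIFIER, OPERATOR, LPAREN, RPAREN }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
        @Override public String toString() { return kind + ":" + text; }
    }

    static List<Token> scan(String s) {
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < s.length() && Character.isDigit(s.charAt(i))) i++;
                if (i < s.length() && s.charAt(i) == '.') {
                    i++; // fraction part (digits after the point are optional)
                    while (i < s.length() && Character.isDigit(s.charAt(i))) i++;
                }
                // Consume an exponent only if E/e is followed by an optional
                // sign and at least one digit; otherwise leave it alone.
                if (i < s.length() && (s.charAt(i) == 'E' || s.charAt(i) == 'e')) {
                    int j = i + 1;
                    if (j < s.length() && (s.charAt(j) == '+' || s.charAt(j) == '-')) j++;
                    if (j < s.length() && Character.isDigit(s.charAt(j))) {
                        while (j < s.length() && Character.isDigit(s.charAt(j))) j++;
                        i = j;
                    }
                }
                out.add(new Token(Kind.NUMBER, s.substring(start, i)));
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < s.length() && Character.isLetterOrDigit(s.charAt(i))) i++;
                out.add(new Token(Kind.IDENTIFIER, s.substring(start, i)));
            } else if (c == '(') {
                out.add(new Token(Kind.LPAREN, "("));
                i++;
            } else if (c == ')') {
                out.add(new Token(Kind.RPAREN, ")"));
                i++;
            } else if ("+-*/".indexOf(c) >= 0) {
                // Unary vs. binary minus is left for the parser to decide.
                out.add(new Token(Kind.OPERATOR, String.valueOf(c)));
                i++;
            } else {
                throw new IllegalArgumentException(
                        "Unexpected character '" + c + "' at index " + i);
            }
        }
        return out;
    }
}
```

Calling `ExpressionScanner.scan(str)` on the string from the question keeps `1.33E+4` as one `NUMBER` token and also returns the operators and parentheses, which a recursive descent or shunting-yard parser would then consume.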

I would first replace the E+ with a symbol that is not ambiguous, such as:

    // replace() matches literally; replaceAll("E+", ...) would treat "E+" as a regex
    str = str.replace("E+", "SCINOT");

You can then parse with StringTokenizer, replacing the SCINOT symbol when you need to evaluate the number represented in scientific notation.
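A minimal sketch of this placeholder idea, assuming the delimiters are just `+-*/()`. The class name `PlaceholderTokenizer` is made up for illustration. It only handles the `E+` form (an exponent written as `E-4` would still be split), and it would mangle an identifier that ends in an uppercase `E` directly before a `+`, or one that already contains `SCINOT`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper; the class and method names are illustrative only.
class PlaceholderTokenizer {
    static List<String> tokenize(String expr) {
        // String.replace works on literal text, so '+' needs no escaping here.
        String masked = expr.replace("E+", "SCINOT");

        // StringTokenizer skips runs of delimiters, so no empty tokens appear.
        StringTokenizer st = new StringTokenizer(masked, "+-*/()");
        List<String> tokens = new ArrayList<>();
        while (st.hasMoreTokens()) {
            // Restore the exponent marker before handing the token back.
            tokens.add(st.nextToken().replace("SCINOT", "E+"));
        }
        return tokens;
    }
}
```

For the string in the question, `PlaceholderTokenizer.tokenize(str)` returns the ten operands with `1.33E+4` intact. The operators are discarded, as in the original `split` attempt.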

Try this:

String[] tokens=str.split("(?<!E)+[*\\-/()+]");

It's easier to achieve the result with Matcher:

    String str = "-1.33E+4-helloeeee+4+(5*2(10/2)5*10)/2";
    Matcher m = Pattern.compile("\\d+\\.\\d*E[+-]?\\d+|\\w+").matcher(str);
    while(m.find()) {
        System.out.println(m.group());
    }

This prints:

1.33E+4
helloeeee
4
5
2
10
2
5
10
2

Note that it needs some testing with different floating-point forms, but the pattern is easy to adjust.
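One possible adjustment, offered as an untested-in-production sketch rather than a complete number grammar: make the fraction and the exponent optional and accept a lowercase `e`. The class name `RegexTokenizer` and the exact pattern are assumptions, not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the class name and the widened pattern are illustrative only.
class RegexTokenizer {
    // Number alternative first: digits with optional fraction, or a leading-dot
    // fraction, each with an optional exponent. Anything else word-like second.
    private static final Pattern TOKEN = Pattern.compile(
            "(?:\\d+\\.?\\d*|\\.\\d+)(?:[eE][+-]?\\d+)?|\\w+");

    static List<String> tokenize(String expr) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(expr);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }
}
```

With this pattern, forms such as `1e5`, `.5` and `2.5E-3` each come out as one token, and the string from the question still yields the same ten operands. A digit immediately followed by letters, such as `2x`, is split into `2` and `x`, which may or may not be what you want.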
