Extracting variables from a mathematical equation

I have a string like this:

a+(b * 6) <= cat*45 && cat = dog

I am trying to extract the variables a, b, cat, dog. Below is my code.

        Set<String> varList = null; 
        StringBuilder sb = null; 
        String expression = "a+(b * 6) <= cat*45 && cat = dog";
        if (expression!=null)
        {
            sb = new StringBuilder(); 

            //list that will contain encountered words,numbers, and white space
            varList = new HashSet<String>();

            Pattern p = Pattern.compile("[A-Za-z\\s]");
            Matcher m = p.matcher(expression);

            //while matches are found 
            while (m.find())
            {
                //add words/variables found in the expression 
                sb.append(m.group());
            }//end while 

            //split the expression based on white space 
            String [] splitExpression = sb.toString().split("\\s");
            for (int i=0; i<splitExpression.length; i++)
            {
                varList.add(splitExpression[i]);
            }
        }

        Iterator iter = varList.iterator();
        while (iter.hasNext()) {
            System.out.println(iter.next());
        }

The output I'm getting is:

ab
cat
dog

Required output:

a
b
cat
dog

The variables may or may not be separated by white space. When they are, the output is correct, but when they are not, I get the wrong output. Can someone suggest the proper Pattern?

Why use a regex find() loop to extract words, then concatenate them all into a string, only to split that string again?

Just use the words found by the regex.

That is, of course, after removing the whitespace class ( \\s ) from the pattern and adding a quantifier ( + ) so that it matches entire words.

Pattern p = Pattern.compile("[A-Za-z]+");
Matcher m = p.matcher(expression);
while (m.find())
{
    varList.add(m.group());
}
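To see why the original code printed `ab`, it may help to look at the intermediate string it builds. The following is a minimal sketch, not the asker's exact program: the class and method names (`ExtractionDemo`, `concatenateSingleCharMatches`, `extractVariables`) are made up for illustration, and a `LinkedHashSet` is used only so the variables print in order of first appearance.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class ExtractionDemo {

    // Rebuilds the intermediate string from the question: each match is a
    // single letter or whitespace character, so operators and digits vanish
    // and "a" and "b" end up adjacent.
    static String concatenateSingleCharMatches(String expression) {
        StringBuilder sb = new StringBuilder();
        Matcher m = Pattern.compile("[A-Za-z\\s]").matcher(expression);
        while (m.find()) {
            sb.append(m.group());
        }
        return sb.toString();
    }

    // Collects each run of letters directly, with no concatenate/split step.
    static Set<String> extractVariables(String expression) {
        Set<String> vars = new LinkedHashSet<>();
        Matcher m = Pattern.compile("[A-Za-z]+").matcher(expression);
        while (m.find()) {
            vars.add(m.group());
        }
        return vars;
    }

    public static void main(String[] args) {
        String expression = "a+(b * 6) <= cat*45 && cat = dog";
        // The brackets make the surviving whitespace visible.
        System.out.println("[" + concatenateSingleCharMatches(expression) + "]");
        System.out.println(extractVariables(expression)); // prints [a, b, cat, dog]
    }
}
```

Because `+(` contains neither letters nor whitespace, nothing separates `a` from `b` in the concatenated string, and splitting on whitespace cannot recover them.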

If your variables are simply strings of letters, you can find them with a simple regex like this.

Regex: [A-Za-z]+

Regex101 Demo

This regex should work (a variable name must start with an uppercase or lowercase letter and may then contain digits, underscores, and uppercase or lowercase letters):

\b[A-Za-z]\w*\b

Regex Demo

Java Code

Set<String> set = new HashSet<String>();
String line = "a+(b * 6) <= cat*45 && cat = dog";
String pattern = "\\b([A-Za-z]\\w*)\\b";

Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(line);

while (m.find()) {
    set.add(m.group());
}
System.out.println(set);

Ideone Demo
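The two patterns suggested in this thread only differ once variable names contain digits or underscores. Here is a small sketch of that difference; the input expression `x1 + total_sum*2 <= y` and the names `PatternComparison` and `findAll` are invented for illustration and do not come from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; names are illustrative only.
class PatternComparison {

    // Returns every match of the given regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Assumed sample input with a digit and an underscore in the names.
        String expression = "x1 + total_sum*2 <= y";

        // Letters only: identifiers are cut at digits and underscores.
        System.out.println(findAll("[A-Za-z]+", expression));
        // prints [x, total, sum, y]

        // Letter followed by word characters: identifiers stay whole.
        System.out.println(findAll("\\b[A-Za-z]\\w*\\b", expression));
        // prints [x1, total_sum, y]
    }
}
```

For the expression in the question both patterns give the same variables, so the choice mainly depends on which names you expect to see.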

I believe you should replace your regexp with "[A-Za-z]+". I just simulated it in Python:

>>> re.findall('[A-Za-z]+', 'a+(b * 6) <= cat*45 && cat = dog')
['a', 'b', 'cat', 'cat', 'dog']
>>>

Next, put the result list into a set:

>>> rs = set(re.findall('[A-Za-z]+', 'a+(b * 6) <= cat*45 && cat = dog'))
>>> for w in rs:
...     print w,
...
a b dog cat
>>>
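Since the question is about Java, a rough equivalent of the `set(re.findall(...))` idea might look like the sketch below. It assumes Java 9 or later, where `Matcher.results()` is available; the class and method names (`FindAllToSet`, `distinctWords`) are made up for illustration, and a `TreeSet` is used only to get a stable, sorted print order.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo class; requires Java 9+ for Matcher.results().
class FindAllToSet {

    // Streams over all matches and collects the distinct words into a sorted set.
    static Set<String> distinctWords(String expression) {
        return Pattern.compile("[A-Za-z]+")
                .matcher(expression)
                .results()
                .map(MatchResult::group)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        String expression = "a+(b * 6) <= cat*45 && cat = dog";
        System.out.println(distinctWords(expression)); // prints [a, b, cat, dog]
    }
}
```

The duplicate `cat` disappears because a set keeps each element only once, just as in the Python session.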

Fully working code

public static void main(String[] args) {
    Set<String> varList = null; 
    StringBuilder sb = null; 
    String expression = "a+(b * 6) <= cat*45 && cat = dog";
    if (expression!=null)
    {
        sb = new StringBuilder(); 

        //set that will contain the variable names found in the expression
        varList = new HashSet<String>();

        Pattern p = Pattern.compile("[A-Za-z\\s]+");
        Matcher m = p.matcher(expression);

        //while matches are found 
        while (m.find())
        {
            //add words/variables found in the expression 
            sb.append(m.group());
            sb.append(",");
        }//end while 

        //split the collected matches on the comma separator 
        String [] splitExpression = sb.toString().split(",");
        for (int i=0; i<splitExpression.length; i++)
        {
            //trim first so that whitespace-only matches are skipped
            String candidate = splitExpression[i].trim();
            if(!candidate.isEmpty())
                varList.add(candidate);
        }
    }

    Iterator iter = varList.iterator();
    while (iter.hasNext()) {
        System.out.println(iter.next());
    }
}
