
Capture everything except a specific regex

I'm having some trouble implementing a regex. I need to capture everything except one specific part of the input. The format is essentially:

fraction operator fraction

This is the current regex:

([ \t]*-?[0-9]+[ \t]*/[ \t]*-?[0-9]+[ \t]*)

Right now, it's capturing both fractions, but I need it to get just the operator. So for example, if I entered:

-12/43 - 3/-5

It would capture -12/43 and 3/-5. I need it to capture everything but that: the operator in the middle.

I've searched for a while, but the answers I find all describe different ways of doing this. Any help would be greatly appreciated!

A fairly simple way to do what you are asking for is to replace your fractions with "". Whatever you have left is the operator.

Pattern fraction = Pattern.compile("\\s*-?\\d+\\s*/\\s*-?\\d+\\s*");
Matcher matcher = fraction.matcher(myExpression);
// Remove every fraction; what remains is the operator
String operator = matcher.replaceAll("").trim();

The operator variable is now the original expression with all substrings matching the fraction pattern removed.

Note that I've used the predefined character classes for whitespace (\s) and digits (\d).
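As a self-contained illustration of the replacement approach above, here is a minimal sketch. The class name OperatorByRemoval and the method extractOperator are hypothetical names chosen for this example; the pattern is the one from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OperatorByRemoval {
    // Fraction pattern from the answer: optional sign, digits, slash,
    // optional sign, digits, with optional whitespace around each part.
    private static final Pattern FRACTION =
            Pattern.compile("\\s*-?\\d+\\s*/\\s*-?\\d+\\s*");

    // Hypothetical helper: deletes every fraction and returns what is left.
    static String extractOperator(String expression) {
        Matcher matcher = FRACTION.matcher(expression);
        return matcher.replaceAll("").trim();
    }

    public static void main(String[] args) {
        System.out.println(extractOperator("-12/43 - 3/-5"));      // prints "-"
        System.out.println(extractOperator("-21 / 12 + 14 / -5")); // prints "+"
    }
}
```

One limitation of this sketch: with no whitespace around a minus operator, as in 1/2-3/4, the pattern reads the minus as the sign of the second fraction and the result is an empty string.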

Although this should work, I don't recommend this approach. Assuming you will want to do something with the expression (like evaluate it), you are much better off using a proper lexical and semantic analysis technique. There are third-party libraries (e.g. JFlex) that can do the job with pretty minimal effort and a lot more flexibility.
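To give a rough idea of what lexical analysis means for this input, here is a small hand-written tokenizer. It is not JFlex and not generated code, only a sketch under the assumption that the input strictly alternates fraction, operator, fraction. The class name FractionLexer and the method tokenize are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FractionLexer {
    private static final Pattern FRACTION = Pattern.compile("-?\\d+\\s*/\\s*-?\\d+");
    private static final Pattern OPERATOR = Pattern.compile("[-+*/]");

    // Splits the input into alternating fraction and operator tokens.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        boolean expectFraction = true; // the first token must be a fraction
        int pos = 0;
        while (true) {
            // Skip whitespace between tokens.
            while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) {
                pos++;
            }
            if (pos == input.length()) {
                break;
            }
            // Try only the token kind that is allowed at this point.
            Matcher m = (expectFraction ? FRACTION : OPERATOR).matcher(input);
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected input at index " + pos);
            }
            tokens.add(m.group());
            pos = m.end();
            expectFraction = !expectFraction;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("-12/43 - 3/-5")); // prints [-12/43, -, 3/-5]
    }
}
```

Because the tokenizer tracks which kind of token it expects next, a minus after a fraction is always read as an operator, so 1/2-3/4 splits into 1/2, -, 3/4. The sketch does not check that the input ends on a fraction.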

Try using a capturing group:

/(?:\s*-?\d+\s*\/\s*-?\d+)\s*([*/+-])\s*(?:\s*-?\d+\s*\/\s*-?\d+)/ig

I'll explain what's happening here. We have three groups: two non-capturing groups, written (?:...), and one capturing group, written (...). The non-capturing groups surround the fractions that you're ignoring, while the capturing group surrounds the operator that you are looking for, [*/+-]. Now you can get the operator with:

Pattern pattern = Pattern.compile("(?:\\s*-?\\d+\\s*\\/\\s*-?\\d+)\\s*([*/+-])\\s*(?:\\s*-?\\d+\\s*\\/\\s*-?\\d+)");
Matcher matcher = pattern.matcher(whatAreWeMatching);
// find() must succeed before group(1) can be read
if (matcher.find()) {
    String operator = matcher.group(1); // returns "-" with the example you gave
}

Note that I've also included a lot of checks for whitespace (\\s*) so that you can freely put spaces around any operator. Without the whitespace checks, the regex would be a lot shorter but would not account for things like -21 / 12 + 14 / -5. If you're confident that your input will only have the structure you have shown us, you could use:

/(?:-?\d+\/-?\d+) ([*/+-]) (?:-?\d+\/-?\d+)/ig

(Just remember to double every backslash before putting this in a Java string.)
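Put together as a runnable sketch, the capturing-group approach might look like this. The class name OperatorByGroup and the method extractOperator are hypothetical, and returning an Optional for input that does not match is a choice made for this example, not something the answer prescribes.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OperatorByGroup {
    // Whitespace-tolerant pattern from the answer; only the operator is captured.
    private static final Pattern EXPRESSION = Pattern.compile(
            "(?:\\s*-?\\d+\\s*/\\s*-?\\d+)\\s*([*/+-])\\s*(?:\\s*-?\\d+\\s*/\\s*-?\\d+)");

    // Hypothetical helper: returns the operator, or empty if nothing matches.
    static Optional<String> extractOperator(String input) {
        Matcher matcher = EXPRESSION.matcher(input);
        // group(1) is only valid after a successful find().
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extractOperator("-12/43 - 3/-5"));      // prints Optional[-]
        System.out.println(extractOperator("-21 / 12 + 14 / -5")); // prints Optional[+]
    }
}
```

Note that find() accepts a match anywhere in the input. If the whole string must be a single "fraction operator fraction" expression, matches() would be the stricter choice.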

If your inputs always have this structure (values separated by single operators), then you can simply append an alternation branch to your expression to grab that operator:

|(\\S) or, if you already know what those operators should be, |([-+*/]). (The hyphen goes first in the character class; written as [+-*/], Java would read +-* as an invalid range.)

Which gives:

([ \t]*-?[0-9]+[ \t]*/[ \t]*-?[0-9]+[ \t]*)|(\S)


Note: of course, you should escape the backslashes in your code accordingly (for example, \S becomes \\S in a Java string).
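With the alternation in place, each match is either a fraction (group 1) or a single non-whitespace character (group 2), so the operators can be collected by looping over find(). A minimal sketch, with the hypothetical names OperatorByAlternation and extractOperators:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OperatorByAlternation {
    // Original fraction pattern as group 1, plus the appended branch as group 2.
    private static final Pattern TOKEN = Pattern.compile(
            "([ \\t]*-?[0-9]+[ \\t]*/[ \\t]*-?[0-9]+[ \\t]*)|(\\S)");

    // Hypothetical helper: collects everything matched by the second branch.
    static List<String> extractOperators(String input) {
        List<String> operators = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            // group(2) is null whenever the fraction branch matched instead.
            if (matcher.group(2) != null) {
                operators.add(matcher.group(2));
            }
        }
        return operators;
    }

    public static void main(String[] args) {
        System.out.println(extractOperators("-12/43 - 3/-5"));   // prints [-]
        System.out.println(extractOperators("1/2 + 3/4 * 5/6")); // prints [+, *]
    }
}
```

Since the second branch is \S, any stray non-whitespace character outside a fraction is reported as an "operator". Using |([-+*/]) instead restricts group 2 to the known operators.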
