Regex for an if condition in Java

I have this problem: the input is POLL followed by a combination of 10 As, Ds, or Ms (agree, disagree, maybe), followed by a yes or no. If the answer is no, a reason must follow.

Capture the following:

  • The A, D, M answers
  • The yes/no answer
  • [The reason that follows a no]

//case insensitive

I came up with this regex:

 POLL\s+([ADM]{10})\s+(yes|no\s+([a-z. ]+))

The string is: POLL admaaadddm no no comment

The output is:

combination --> admaaadddm
yes or no --> no no comment  //this should be fixed; it must capture "no" only
reason --> no comment

My code:

    String message = "POLL admaaadddm no no comment";

    Pattern pattern = Pattern.compile("POLL\\s+([ADM]{10})\\s+(yes|no\\s+([a-z. ]+))"
            ,Pattern.CASE_INSENSITIVE);


    Matcher m = pattern.matcher(message);

    try
    {
        if (m.matches())
        {

            String combination = m.group(1);
            String yesno = m.group(2);
            String reason = m.group(3);

            System.out.println(combination);
            System.out.println(yesno);
            System.out.println(reason);

        }
    }
    catch (NullPointerException e)
    {
    }

Maybe try this:

Pattern pattern = Pattern.compile("POLL\\s+([ADM]{10})\\s+(no|yes$)((?:(?<=yes)($)|\\s+(.*)))", Pattern.CASE_INSENSITIVE);

Does this work?

Pattern pattern = Pattern.compile("POLL\\s+([adm]{10})\\s+(yes|no)\\s+([a-z. ]+)"
            ,Pattern.CASE_INSENSITIVE);

Then get groups 1, 2, and 3.

You can use a non-capturing group so that the part matched by the alternation is not stored in a capturing group.

(?:...) is a non-capturing group.

 POLL\s+([ADM]{10})\s+(?:yes|(no)\s+([a-z. ]+))

Update

Then I think you need to go for something like this:

POLL\s+([ADM]{10})\s+(?:(yes)|(no)\s+([a-z. ]+))

See it here on Regexr (you can see the content of the groups when the mouse hovers over the match).

The problem is that you now have 4 capturing groups. You can't avoid this in Java, since the groups in the two branches of the alternation are different groups.

So you need to check whether group[2] or group[3] is set. If group[3] is set, there is also a group[4] holding the comment.

group[1] always contains the ADM part

group[2] contains "yes" if the answer is "yes", otherwise NULL

group[3] contains "no" if the answer is "no", otherwise NULL

group[4] contains the comment if there is one, otherwise NULL
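The group checks described above can be sketched in Java as follows. This is only an illustration under stated assumptions: the class name PollFourGroups and the helper parse are hypothetical, and returning a three-element array is just one convenient way to hand back the result.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PollFourGroups {
    // Pattern from the answer: group 2 = "yes", group 3 = "no", group 4 = reason.
    private static final Pattern POLL = Pattern.compile(
            "POLL\\s+([ADM]{10})\\s+(?:(yes)|(no)\\s+([a-z. ]+))",
            Pattern.CASE_INSENSITIVE);

    /** Hypothetical helper: returns {combination, answer, reason}, or null if there is no match. */
    static String[] parse(String message) {
        Matcher m = POLL.matcher(message);
        if (!m.matches()) {
            return null;
        }
        // Only one branch of the alternation takes part in a match,
        // so exactly one of group 2 and group 3 is non-null.
        String answer = m.group(2) != null ? m.group(2) : m.group(3);
        return new String[] { m.group(1), answer, m.group(4) };
    }

    public static void main(String[] args) {
        String[] r = parse("POLL admaaadddm no no comment");
        System.out.println(r[0] + " | " + r[1] + " | " + r[2]);
    }
}
```

With this pattern a "yes" followed by extra text, or a "no" without a reason, does not match at all, so parse returns null for both.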

Try POLL\\s+([ADM]{10})\\s+((yes|no)(\\s+([a-z. ]+))?) - we add a new group for (yes|no). Its number will be 3, while the reason's group will be 5. This also matches optional text after yes, which you can simply ignore; I guess that should be OK.

EDIT:

By inserting the dollar sign $ after yes, you prevent a match if there is anything after yes: POLL\\s+([ADM]{10})\\s+((yes$|no)(\\s+([a-z. ]+))?)

EDIT 2 (in response to @TristanDiaz):

I wouldn't bet my life on it, but I'm afraid what you want is not possible, at least with standard regex. On one hand, you want no and the explanation after it to always come together. This means a concatenation in the regex. On the other hand, you want to capture only yes or no into one of your groups, which requires somehow splitting no from the string after it. You can't have it both ways at the same time. You will either have to do something outside of the regex (e.g. capture no and the text after it into a single group and split it with regular string functions), or choose the capturing group from which to take the yes/no text depending on a condition. Either way, you need external code.

Regular expressions have limited expressive power, and not everything can be expressed with them. For example, languages as simple as "n A-characters followed by n B-characters" or "arithmetic expressions with correctly nested parentheses" cannot be expressed using regex.

If this were a practical task, I would suggest not using regex at all, but rather splitting the input string on the first N spaces and validating each part separately using normal code.
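A minimal sketch of that split-and-validate idea is shown below. The names PollSplitParser and parse are hypothetical. Two assumptions of mine go beyond the answer: String.split still takes a small regex ("\\s+") for the separator, and the reason text is accepted as-is instead of being restricted to letters, dots, and spaces.

```java
class PollSplitParser {
    /** Hypothetical helper: returns {combination, answer, reason}, or null if the input is invalid. */
    static String[] parse(String message) {
        // A limit of 4 keeps the whole reason, spaces included, in the last element.
        String[] parts = message.trim().split("\\s+", 4);
        if (parts.length < 3 || !parts[0].equalsIgnoreCase("POLL")) {
            return null;
        }
        String combination = parts[1];
        if (combination.length() != 10) {
            return null;
        }
        // Every character must be A, D, or M (case insensitive).
        for (char c : combination.toUpperCase().toCharArray()) {
            if ("ADM".indexOf(c) < 0) {
                return null;
            }
        }
        String answer = parts[2];
        if (answer.equalsIgnoreCase("yes")) {
            // "yes" must not be followed by anything.
            return parts.length == 3 ? new String[] { combination, answer, null } : null;
        }
        if (answer.equalsIgnoreCase("no")) {
            // "no" must be followed by a reason.
            return parts.length == 4 ? new String[] { combination, answer, parts[3] } : null;
        }
        return null;
    }

    public static void main(String[] args) {
        String[] r = parse("POLL admaaadddm no no comment");
        System.out.println(r[0] + " | " + r[1] + " | " + r[2]);
    }
}
```

Each rule sits in its own if statement, so changing one requirement (for example, allowing an optional reason after yes) does not mean reworking a single large pattern.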

Your problem is that the third capturing group is nested inside the second, so the reason is captured as part of group 2 as well as group 3. Try moving a parenthesis from the end of the regex to just after the 'no', i.e. POLL\\s+([adm]{10})\\s+(yes|no)\\s+([a-z. ]+) .

If group 2 equals "yes", just ignore anything that is matched by group 3, assuming there is anything there.

Edit

Ok, try this: POLL\\s+([adm]{10})\\s+(yes|no)(?:(?<=no)([a-z. ]+)|$) (RegExr)

A no without a reason shouldn't match, and neither should a yes with a reason. The capturing groups are constant too, i.e. group 2 always captures yes/no and group 3 always captures the reason.
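Here is a hedged sketch of how the lookbehind pattern might be used from Java; the class name PollLookbehind and the helper parse are hypothetical. Two details are worth knowing when trying it. Group 3 starts with the separating space, because the space is part of the character class, so the sketch trims it. Also, as the pattern is written, the $ branch is available after a bare no as well, so the sketch rejects that case with an explicit check that is my addition and not part of the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PollLookbehind {
    private static final Pattern POLL = Pattern.compile(
            "POLL\\s+([adm]{10})\\s+(yes|no)(?:(?<=no)([a-z. ]+)|$)",
            Pattern.CASE_INSENSITIVE);

    /** Hypothetical helper: returns {combination, answer, reason}, or null if rejected. */
    static String[] parse(String message) {
        Matcher m = POLL.matcher(message);
        if (!m.matches()) {
            return null;
        }
        String answer = m.group(2);
        // Group 3 is null when the "$" branch matched instead of the reason branch.
        String reason = m.group(3);
        if (answer.equalsIgnoreCase("no") && reason == null) {
            return null; // added check: a bare "no" is not accepted
        }
        return new String[] { m.group(1), answer, reason == null ? null : reason.trim() };
    }

    public static void main(String[] args) {
        String[] r = parse("POLL admaaadddm no no comment");
        System.out.println(r[0] + " | " + r[1] + " | " + r[2]);
    }
}
```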

You can put a conditional statement in your regex with a lookahead.

Pattern.compile("POLL\\s+([ADM]{10})\\s+((?=no)(no)\\s(.+$)|yes$)", Pattern.CASE_INSENSITIVE);

This won't match a string that contains a comment after "yes", nor will it match a "no" without a comment. Use groups 1, 3 & 4 with "no" and groups 1 & 2 with "yes".
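Picking the right groups for each branch could look like the sketch below. The class name PollLookahead and the helper parse are hypothetical; the pattern is the one from this answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PollLookahead {
    private static final Pattern POLL = Pattern.compile(
            "POLL\\s+([ADM]{10})\\s+((?=no)(no)\\s(.+$)|yes$)",
            Pattern.CASE_INSENSITIVE);

    /** Hypothetical helper: returns {combination, answer, reason}, or null if there is no match. */
    static String[] parse(String message) {
        Matcher m = POLL.matcher(message);
        if (!m.matches()) {
            return null;
        }
        if (m.group(3) != null) {
            // "no" branch: group 3 is the answer, group 4 the reason.
            return new String[] { m.group(1), m.group(3), m.group(4) };
        }
        // "yes" branch: group 2 holds just "yes"; there is no reason.
        return new String[] { m.group(1), m.group(2), null };
    }

    public static void main(String[] args) {
        String[] r = parse("POLL admaaadddm no no comment");
        System.out.println(r[0] + " | " + r[1] + " | " + r[2]);
    }
}
```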

** EDIT **

The following regex should work, and will store the results in the correct groups: 1, 2 & 3 (use group 5 if you want the "reason" without the leading white space; group 4 is the empty ($) match after "yes").

Pattern pattern = Pattern.compile("POLL\\s+([ADM]{10})\\s+(no|yes$)((?:(?<=yes)($)|\\s+(.*)))", Pattern.CASE_INSENSITIVE);
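To try this pattern, a small sketch such as the one below may help; the class name PollFixedGroups and the helper parse are hypothetical. Counting opening parentheses, group 3 is the whole tail (including the leading whitespace), group 4 is the empty ($) match after "yes", and group 5 is the (.*) reason, so the sketch reads the reason from group 5.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PollFixedGroups {
    private static final Pattern POLL = Pattern.compile(
            "POLL\\s+([ADM]{10})\\s+(no|yes$)((?:(?<=yes)($)|\\s+(.*)))",
            Pattern.CASE_INSENSITIVE);

    /** Hypothetical helper: returns {combination, answer, reason}, or null if there is no match. */
    static String[] parse(String message) {
        Matcher m = POLL.matcher(message);
        if (!m.matches()) {
            return null;
        }
        // Group 5 is null on the "yes" branch and holds the reason on the "no" branch.
        return new String[] { m.group(1), m.group(2), m.group(5) };
    }

    public static void main(String[] args) {
        String[] r = parse("POLL admaaadddm no no comment");
        System.out.println(r[0] + " | " + r[1] + " | " + r[2]);
    }
}
```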
