
Pattern.compile("(.*?):")

I'm trying to understand the following code:

Pattern.compile("(.*?):")

I already did some research about what it could mean, but I don't quite get it:

According to the Java docs, * means zero or more times, while ? means once or not at all.

Also, what does the ':' mean?

Thanks

This is called a reluctant quantifier. An asterisk and a question mark together, *?, mean "zero or more times, without matching more characters from the input than is needed". This is what prevents the dot expression . from matching the subsequent colon : in the input.
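As a minimal sketch of that behavior, the following collects every successive match of the reluctant and the greedy form on the same input. The class name ReluctantDemo, the helper captures, and the sample input "a:b:c:" are made up for illustration and are not part of the original question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample input are illustrative only.
class ReluctantDemo {
    // Collects group 1 of every successive match of the regex in the input.
    static List<String> captures(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "a:b:c:";
        // Reluctant: each match stops at the nearest colon.
        System.out.println(captures("(.*?):", input)); // [a, b, c]
        // Greedy: one match that runs up to the last colon.
        System.out.println(captures("(.*):", input));  // [a:b:c]
    }
}
```

With the reluctant form, find() can be called repeatedly to walk through the colon-terminated pieces one at a time; the greedy form consumes them all in a single match.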

A better expression to match the same sequence is [^:]*:, because it lets you avoid backtracking. Here is a link to an article explaining why.
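A small sketch comparing the two expressions on the sample string used elsewhere in this thread. The capturing parentheses around [^:]* are added here so the two results can be compared directly; the class name NegatedClassDemo and the helper firstCapture are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the parentheses around [^:]* are an addition for comparison.
class NegatedClassDemo {
    // Returns group 1 of the first match, or null if the regex does not match.
    static String firstCapture(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "Hello: This is a Test:";
        System.out.println(firstCapture("(.*?):", input));   // Hello
        System.out.println(firstCapture("([^:]*):", input)); // Hello
    }
}
```

One difference worth keeping in mind: by default . does not match line terminators, whereas [^:] does, so the two patterns can give different results on multi-line input.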

A ? after a greedy operator such as + or * makes the operator non-greedy. Without the ?, the .* would keep matching as many characters as it can, including any earlier : characters, and would give back only enough to let the final : match.

As it is, the regex will match any string that comes before the colon ( : ). In this case, the colon is not a special character. Whatever comes before the colon will be captured in a group, which can be accessed later through a Matcher object.

This code snippet will hopefully make things clearer:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String str = "Hello: This is a Test:";
    Pattern p1 = Pattern.compile("(.*?):");  // reluctant: stops at the first colon
    Pattern p2 = Pattern.compile("(.*):");   // greedy: runs to the last colon

    Matcher m1 = p1.matcher(str);
    if (m1.find())
    {
        System.out.println(m1.group(1));
    }

    Matcher m2 = p2.matcher(str);
    if (m2.find())
    {
        System.out.println(m2.group(1));
    }

Yields:

Hello

Hello: This is a Test
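To illustrate the point about the group being accessible through the Matcher, here is a short sketch that prints both the whole match, group(0), and the captured part, group(1). The class name GroupDemo, the helper describe, and the sample inputs are assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample inputs are illustrative only.
class GroupDemo {
    // Describes the first match: the whole match (group 0) and the captured text (group 1).
    static String describe(String input) {
        Matcher m = Pattern.compile("(.*?):").matcher(input);
        if (!m.find()) {
            return "no match";
        }
        return "group(0)=" + m.group(0) + ", group(1)=" + m.group(1);
    }

    public static void main(String[] args) {
        System.out.println(describe("key: value"));    // group(0)=key:, group(1)=key
        System.out.println(describe("no colon here")); // no match
    }
}
```

group(0) includes the colon because the colon is part of the pattern; group(1) holds only what the parentheses enclosed.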

This regular expression means anything ending with : or, put another way, anything up to the first : .

Here ':' has no special meaning; it is matched literally. Any input of the form anystring: will match this pattern.

I think the '?' is redundant and will be applied on '.*'.

':' has no special meaning whatsoever in regexps and will be matched to the characters in the string.

EDIT: dasblinkenlight is right: if the quantifier is greedy, the regexp will try to match as much as it can. He is right in his suggestion as well.

I found a link which lists greedy vs reluctant: What is the difference between `Greedy` and `Reluctant` regular expression quantifiers?
