I can't get the first group of a regex pattern in Java

Question

I'm trying to get the first group of a regex pattern. I got this string from a lyric text:

[01:34][01:36]Blablablahh nanana

I'm using this regex pattern to extract [01:34], [01:36] and the text.

Pattern timeLine = Pattern.compile("(\\[\\d\\d:\\d\\d\\])+(.*)");

But when I try to extract the first group, [01:34], using group(1), it returns [01:36].

Is there something wrong in the regex pattern?

Answer 1

Your problem is here:

Pattern.compile("(\\[\\d\\d:\\d\\d\\])+(.*)");
                                      ^

This part of your pattern, (\\[\\d\\d:\\d\\d\\])+, will match [01:34][01:36] because of the + quantifier (which is greedy), but group 1 can hold only one [dd:dd] token at a time. Each repetition overwrites the previous capture, so the group ends up storing the last token matched.
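The overwrite behaviour can be reproduced in isolation. The following is a minimal sketch using the pattern from the question; the class name RepeatedGroupDemo and the helper groupOne are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: shows what group(1) holds after a quantified group repeats.
class RepeatedGroupDemo {
    // Pattern taken from the question: one capturing group under a + quantifier.
    private static final Pattern TIME_LINE =
            Pattern.compile("(\\[\\d\\d:\\d\\d\\])+(.*)");

    // Returns the content of group 1, or null if the line does not match.
    static String groupOne(String line) {
        Matcher m = TIME_LINE.matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Group 1 matched twice; only the last repetition is kept.
        System.out.println(groupOne("[01:34][01:36]Blablablahh nanana")); // prints [01:36]
    }
}
```

With a single timestamp in the line, group 1 would hold that timestamp, which is why the problem only shows up on lines carrying two or more.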

If you want to find only [01:34], you can correct your pattern by removing the +. But you can also create a simpler pattern

Pattern.compile("^\\[\\d\\d:\\d\\d\\]");

and use it with group(0), which is also what group() returns.

Pattern timeLine = Pattern.compile("^\\[\\d\\d:\\d\\d\\]");
Matcher m = timeLine.matcher("[01:34][01:36]Blablablahh nanana");
while (m.find()) {
    System.out.println(m.group()); // prints [01:34]
}

In case you want to extract both timestamps, [01:34][01:36], you can just wrap the repeated group in another pair of parentheses in your current regex, like

Pattern.compile("((\\[\\d\\d:\\d\\d\\])+)(.*)");

This way the entire match of (\\[\\d\\d:\\d\\d\\])+ will be in group 1, and the lyric text moves to group 3.

You can also achieve it by removing (.*) from your original pattern and reading group 0.
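Both variants can be compared side by side. This is a rough sketch rather than a definitive implementation; the class name AllTimestampsDemo and its two helper methods are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: two ways to read the whole run of timestamps.
class AllTimestampsDemo {
    // Variant 1: an outer group wraps the repeated inner group.
    static String viaOuterGroup(String line) {
        Matcher m = Pattern.compile("((\\[\\d\\d:\\d\\d\\])+)(.*)").matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    // Variant 2: drop (.*) and read the overall match (group 0) from find().
    static String viaGroupZero(String line) {
        Matcher m = Pattern.compile("(\\[\\d\\d:\\d\\d\\])+").matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String line = "[01:34][01:36]Blablablahh nanana";
        System.out.println(viaOuterGroup(line)); // prints [01:34][01:36]
        System.out.println(viaGroupZero(line));  // prints [01:34][01:36]
    }
}
```

Note that in the first variant group 2 still holds only the last timestamp, for the same reason as in the original pattern.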

Answer 2

I think you are confused by the repeating match (\\[\\d\\d:\\d\\d\\])+, which returns just the last match as the group value. Try the following and see if it makes more sense to you:

    String s = "[01:34][01:36]Blablablahh nanana";
    Pattern timeLine = Pattern.compile("(\\[\\d\\d:\\d\\d\\])(\\[\\d\\d:\\d\\d\\])(.+)");
    Matcher m = timeLine.matcher(s);
    if (m.matches()) {
        for (int i = 1; i <= m.groupCount(); i++) {
            System.out.printf("    Group %d -> %s\n", i, m.group(i)); // prints each captured group in turn
        }
    }

which for me returns:

Group 1 -> [01:34]
Group 2 -> [01:36]
Group 3 -> Blablablahh nanana
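The pattern above is fixed to exactly two timestamps. If a line may carry any number of them, one possible approach (an assumption on my part, not something the original answers propose) is to loop with find() and anchor each match with \G, so that only the leading run of timestamps is collected. The class name LyricLineParser and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: splits a lyric line into its leading timestamps and the text.
class LyricLineParser {
    // \G anchors each match at the end of the previous match (or at the start of input).
    private static final Pattern TAG = Pattern.compile("\\G\\[\\d\\d:\\d\\d\\]");

    // Collects every timestamp in the leading run, in order.
    static List<String> timestamps(String line) {
        List<String> tags = new ArrayList<>();
        Matcher m = TAG.matcher(line);
        while (m.find()) {
            tags.add(m.group());
        }
        return tags;
    }

    // Returns whatever follows the leading run of timestamps.
    static String text(String line) {
        Matcher m = TAG.matcher(line);
        int end = 0;
        while (m.find()) {
            end = m.end();
        }
        return line.substring(end);
    }

    public static void main(String[] args) {
        String s = "[01:34][01:36]Blablablahh nanana";
        System.out.println(timestamps(s)); // prints [[01:34], [01:36]]
        System.out.println(text(s));       // prints Blablablahh nanana
    }
}
```

Because every timestamp is its own match here, nothing is overwritten and no group numbering is needed.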

Answer 3

I would simply grab the first part using a character class:

String timings = str.replaceAll("([\\[\\]\\d:]+).*", "$1");

And similarly the text:

String text = str.replaceAll("[\\[\\]\\d:]+", "");
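One caveat worth checking: the unanchored character class removes digits, colons and brackets anywhere in the line, not only in the leading timestamps. The sketch below contrasts it with an anchored alternative. The anchored pattern, the class name StripTimestampsDemo and the sample line are my own assumptions, not part of the answer above.

```java
// Hypothetical demo class: unanchored vs. anchored removal of timestamps.
class StripTimestampsDemo {
    // The answer's approach: strips every run of [, ], digits and colons.
    static String unanchored(String line) {
        return line.replaceAll("[\\[\\]\\d:]+", "");
    }

    // Assumed alternative: strips only a run of [dd:dd] tokens at the start of the line.
    static String anchored(String line) {
        return line.replaceFirst("^(\\[\\d\\d:\\d\\d\\])+", "");
    }

    public static void main(String[] args) {
        String line = "[01:34]Track 2: intro";
        System.out.println(unanchored(line)); // prints "Track  intro" (the "2:" is lost)
        System.out.println(anchored(line));   // prints "Track 2: intro"
    }
}
```

For lyric lines that never contain digits or colons in the text, both versions give the same result.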

3 answers

Answer 1
3 accepted 2013-11-17 19:25:43

Answer 2
1 2013-11-17 19:24:56

Answer 3
1 2013-11-17 19:30:11
