I can't get the first group of a regex pattern in Java
I'm trying to get the first group of a regex pattern. I got this string from a lyric text:
[01:34][01:36]Blablablahh nanana
I'm using this regex pattern to extract [01:34], [01:36] and the text.
Pattern timeLine = Pattern.compile("(\\[\\d\\d:\\d\\d\\])+(.*)");
But when I try to extract the first group [01:34] using group(1), it returns [01:36].
Is there something wrong with the regex pattern?
Your problem is here:
Pattern.compile("(\\[\\d\\d:\\d\\d\\])+(.*)");
                                      ^
This part of your pattern, (\\[\\d\\d:\\d\\d\\])+, will match [01:34][01:36] because of the + (which is greedy), but group 1 can hold only one [dd:dd] at a time, so it stores the last one matched.
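A minimal sketch that reproduces this behaviour with the pattern from the question. The class and method names (LastRepetitionDemo, groupOne) are made up for illustration; only the pattern and the input string come from the post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastRepetitionDemo {
    // Returns what group(1) holds after the original pattern matches the whole line,
    // or null if the line does not match at all.
    static String groupOne(String line) {
        Pattern timeLine = Pattern.compile("(\\[\\d\\d:\\d\\d\\])+(.*)");
        Matcher m = timeLine.matcher(line);
        if (!m.matches()) {
            return null;
        }
        // A repeated capturing group keeps only its last iteration.
        return m.group(1);
    }

    public static void main(String[] args) {
        System.out.println(groupOne("[01:34][01:36]Blablablahh nanana")); // prints [01:36]
    }
}
```

With a single timestamp in the line, the same call returns that timestamp, which is why the problem only shows up when the group repeats.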
If you want to find only [01:34], you can correct your pattern by removing the +. But you can also create a simpler pattern:
Pattern.compile("^\\[\\d\\d:\\d\\d\\]");
and use it with group(0), which is also what group() returns.
Pattern timeLine = Pattern.compile("^\\[\\d\\d:\\d\\d\\]");
Matcher m = timeLine.matcher("[01:34][01:36]Blablablahh nanana");
while (m.find()) {
    System.out.println(m.group()); // prints [01:34]
}
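For comparison, here is a rough sketch of the first suggestion (keeping the original pattern but dropping the +). It is only meant to show where the second timestamp ends up; the names FirstTimestampDemo and split are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstTimestampDemo {
    // Original pattern with the + removed: group 1 is exactly one timestamp,
    // group 2 is everything after it.
    static String[] split(String line) {
        Pattern p = Pattern.compile("(\\[\\d\\d:\\d\\d\\])(.*)");
        Matcher m = p.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = split("[01:34][01:36]Blablablahh nanana");
        System.out.println(parts[0]); // prints [01:34]
        System.out.println(parts[1]); // prints [01:36]Blablablahh nanana
    }
}
```

Note that without the +, the second timestamp is no longer consumed by group 1 and becomes part of the text captured by (.*).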
In case you want to extract both [01:34][01:36], you can just add another pair of parentheses to your current regex, like
Pattern.compile("((\\[\\d\\d:\\d\\d\\])+)(.*)");
This way the entire match of (\\[\\d\\d:\\d\\d\\])+ will be in group 1.
You can also achieve this by removing (.*) from your original pattern and reading group 0 (use find() or lookingAt() for this, since matches() requires the pattern to cover the whole string).
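Both variants can be sketched as follows. The class and method names are invented for this example, and lookingAt() is just one way to read group 0 for a prefix match; find() would work here as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholePrefixDemo {
    // Variant 1: wrap the repeated group in an outer group.
    // group 1 = all timestamps, group 2 = last timestamp, group 3 = text.
    static String[] withOuterGroup(String line) {
        Matcher m = Pattern.compile("((\\[\\d\\d:\\d\\d\\])+)(.*)").matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    // Variant 2: drop (.*) and read group 0, i.e. the whole match.
    // lookingAt() matches at the start of the input without requiring the whole input to match.
    static String withGroupZero(String line) {
        Matcher m = Pattern.compile("(\\[\\d\\d:\\d\\d\\])+").matcher(line);
        return m.lookingAt() ? m.group() : null;
    }

    public static void main(String[] args) {
        String line = "[01:34][01:36]Blablablahh nanana";
        String[] g = withOuterGroup(line);
        System.out.println(g[0]); // prints [01:34][01:36]
        System.out.println(g[1]); // prints [01:36]
        System.out.println(g[2]); // prints Blablablahh nanana
        System.out.println(withGroupZero(line)); // prints [01:34][01:36]
    }
}
```

Adding the outer parentheses shifts the numbering: the inner repeated group becomes group 2 and the text moves to group 3.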
I think you are confused by the repeating match (\\[\\d\\d:\\d\\d\\])+, which returns just the last match as the group value. Try the following and see if it makes more sense to you:
String s = "[01:34][01:36]Blablablahh nanana";
Pattern timeLine = Pattern.compile("(\\[\\d\\d:\\d\\d\\])(\\[\\d\\d:\\d\\d\\])(.+)");
Matcher m = timeLine.matcher(s);
if (m.matches()) {
    for (int i = 1; i <= m.groupCount(); i++) {
        System.out.printf(" Group %d -> %s\n", i, m.group(i)); // one line per group
    }
}
which for me returns:
Group 1 -> [01:34]
Group 2 -> [01:36]
Group 3 -> Blablablahh nanana
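The pattern above assumes exactly two timestamps per line. If a line may carry any number of them, one possible approach (not taken from the answers here, so treat it as an assumption) is to match one timestamp at a time with find() and anchor each match to the previous one with \G. The names LyricLineDemo, timestamps and text are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LyricLineDemo {
    // \G anchors each match to the end of the previous match (or to the start of the input),
    // so only the leading run of timestamps is picked up.
    private static final Pattern TIMESTAMP = Pattern.compile("\\G\\[\\d\\d:\\d\\d\\]");

    // Collects every leading timestamp, one list entry per timestamp.
    static List<String> timestamps(String line) {
        List<String> found = new ArrayList<>();
        Matcher m = TIMESTAMP.matcher(line);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    // Returns whatever follows the last leading timestamp.
    static String text(String line) {
        Matcher m = TIMESTAMP.matcher(line);
        int end = 0;
        while (m.find()) {
            end = m.end();
        }
        return line.substring(end);
    }

    public static void main(String[] args) {
        String line = "[01:34][01:36]Blablablahh nanana";
        System.out.println(timestamps(line)); // prints [[01:34], [01:36]]
        System.out.println(text(line));       // prints Blablablahh nanana
    }
}
```

This avoids capturing groups altogether, so the "last repetition wins" behaviour never comes into play.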
I would simply grab the first part using a character class:
String timings = str.replaceAll("([\\[\\]\\d:]+).*", "$1");
And similarly the text:
String text = str.replaceAll("[\\[\\]\\d:]+", "");
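A small sketch of these two calls wrapped in methods. One caveat worth checking: the unanchored text replacement also strips digits, colons and brackets that appear later in the lyric. The anchored variant below (with ^ added) is my own adjustment, not part of the answer, and the input "[00:10]Route 66" is a made-up example.

```java
class ReplaceAllDemo {
    // Keeps only the leading run of timestamp characters.
    static String timings(String str) {
        return str.replaceAll("([\\[\\]\\d:]+).*", "$1");
    }

    // As written in the answer: removes every run of [, ], digits and colons.
    static String textUnanchored(String str) {
        return str.replaceAll("[\\[\\]\\d:]+", "");
    }

    // Assumed variant: ^ restricts the removal to the start of the string.
    static String textAnchored(String str) {
        return str.replaceAll("^[\\[\\]\\d:]+", "");
    }

    public static void main(String[] args) {
        String str = "[01:34][01:36]Blablablahh nanana";
        System.out.println(timings(str));        // prints [01:34][01:36]
        System.out.println(textUnanchored(str)); // prints Blablablahh nanana

        String tricky = "[00:10]Route 66";
        System.out.println(textUnanchored(tricky)); // prints "Route " (the 66 is lost)
        System.out.println(textAnchored(tricky));   // prints Route 66
    }
}
```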