
NSRegularExpression in Objective-C

NSError *error = nil;
// options:0 is enough here: the pattern contains no free-spacing whitespace or comments.
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(\\[(\\d{2}):(\\d{2})\\.(\\d{2})\\])+(.+)" options:0 error:&error];

// options:0 here as well: NSMatchingReportProgress can invoke the block with a nil match.
[regex enumerateMatchesInString:self options:0 range:NSMakeRange(0, [self length]) usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop){
        [*lyricObject addObject:[self substringWithRange:[match rangeAtIndex:5]]];
        NSLog(@"%@",[self substringWithRange:[match rangeAtIndex:1]]);
        // Timestamp in hundredths of a second: (minutes * 60 + seconds) * 100 + hundredths.
        [*stamp addObject:[NSString stringWithFormat:@"%d", ([[self substringWithRange:[match rangeAtIndex:2]] intValue] * 60  +  [[self substringWithRange:[match rangeAtIndex:3]] intValue] ) * 100 + [[self substringWithRange:[match rangeAtIndex:4]] intValue]]];
}];

For the code above, the input string (self) is:

[04:30.50]There are people dying
[04:32.50]If you care enough for the living
[04:35.50]Make a better place for you and for me
[04:51.50][04:45.50][04:43.50][04:39.50]You and for me

I want to get all four groups for [04:51.50][04:45.50][04:43.50][04:39.50], but I can only get the last one, [04:39.50].

Can NSRegularExpression only return the last group when I search with a pattern like (($1)($2)($3)){2}?

A repeated capturing group only captures the last repetition. Your regex does match all four instances in that last line, but it overwrites each match with the next one, leaving only [04:39.50] at the end.
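The overwriting behaviour can be checked in isolation. The following is a minimal sketch, not code from the question: the helper name LastRepeatedStamp is hypothetical, and the pattern is a reduced version of the original that keeps only the repeated timestamp group.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the question's code): returns the text
// captured by group 1 of (\[\d{2}:\d{2}\.\d{2}\])+ for the first match in
// `line`, or nil if the pattern is invalid or nothing matches.
static NSString *LastRepeatedStamp(NSString *line) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"(\\[\\d{2}:\\d{2}\\.\\d{2}\\])+"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        return nil;
    }
    NSTextCheckingResult *match =
        [regex firstMatchInString:line
                          options:0
                            range:NSMakeRange(0, line.length)];
    if (match == nil) {
        return nil;
    }
    // Index 0 (the whole match) spans every stamp on the line, but group 1
    // is overwritten on each repetition, so only the final stamp remains.
    return [line substringWithRange:[match rangeAtIndex:1]];
}
```

Calling LastRepeatedStamp on the last line of the sample input should give back only [04:39.50], even though the whole match covers all four stamps.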

Solution: Repeat a non-capturing group, and put the repeated result into the capturing group:

((?:\\[(\\d{2}):(\\d{2})\\.(\\d{2})\\])+)(.+)

Of course, you can still access $2 through $4 only for the last repetition, but this is a general limitation of regexes. If you need to access each match individually, down to the minutes/seconds/frames parts, then use

((?:\\[\\d{2}:\\d{2}\\.\\d{2}\\])+)(.+)

to match each line first, and then apply a second regex repeatedly to $1 to extract the minutes etc.:

\\[(\\d{2}):(\\d{2})\\.(\\d{2})\\]
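One way the two-pass approach might look in Objective-C is sketched below. The function name StampedLyrics and its return shape (an array of @[stamp, text] pairs) are assumptions made for illustration; the question's own code fills separate stamp and lyricObject arrays instead. Stamps are converted to hundredths of a second using the same arithmetic as the question.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns one @[stamp, text] pair per timestamp, where
// stamp is an NSNumber in hundredths of a second and text is the lyric that
// follows the stamps on that line.
static NSArray<NSArray *> *StampedLyrics(NSString *lyrics) {
    NSError *error = nil;
    // Pass 1: group 1 = all stamps on a line, group 2 = the lyric text.
    NSRegularExpression *lineRegex =
        [NSRegularExpression regularExpressionWithPattern:@"((?:\\[\\d{2}:\\d{2}\\.\\d{2}\\])+)(.+)"
                                                  options:0
                                                    error:&error];
    // Pass 2: one stamp at a time, split into minutes, seconds, hundredths.
    NSRegularExpression *stampRegex =
        [NSRegularExpression regularExpressionWithPattern:@"\\[(\\d{2}):(\\d{2})\\.(\\d{2})\\]"
                                                  options:0
                                                    error:&error];
    NSMutableArray<NSArray *> *result = [NSMutableArray array];
    if (lineRegex == nil || stampRegex == nil) {
        return result;
    }
    NSRange whole = NSMakeRange(0, lyrics.length);
    for (NSTextCheckingResult *lineMatch in
         [lineRegex matchesInString:lyrics options:0 range:whole]) {
        NSString *text = [lyrics substringWithRange:[lineMatch rangeAtIndex:2]];
        // Restrict the second regex to the stamp run captured by group 1.
        NSRange stampsRange = [lineMatch rangeAtIndex:1];
        for (NSTextCheckingResult *m in
             [stampRegex matchesInString:lyrics options:0 range:stampsRange]) {
            NSInteger minutes = [[lyrics substringWithRange:[m rangeAtIndex:1]] integerValue];
            NSInteger seconds = [[lyrics substringWithRange:[m rangeAtIndex:2]] integerValue];
            NSInteger hundredths = [[lyrics substringWithRange:[m rangeAtIndex:3]] integerValue];
            [result addObject:@[@((minutes * 60 + seconds) * 100 + hundredths), text]];
        }
    }
    return result;
}
```

With this shape, a line carrying four stamps yields four pairs that share the same lyric text, which is usually what is wanted when the entries are later sorted by time.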
