Capturing parentheses using regex on iPhone

I'm trying to extract the URL from some HTML I'm parsing (on iPhone), using capturing parentheses to group just the part I'm interested in.

I now have this:

NSString *imageHtml;  //a string with some HTML in it

NSRegularExpression* innerRegex = [[NSRegularExpression alloc] initWithPattern:@"href=\"(.*?)\"" options:NSRegularExpressionCaseInsensitive|NSRegularExpressionDotMatchesLineSeparators error:nil];
NSTextCheckingResult* firstMatch = [innerRegex firstMatchInString:imageHtml options:0 range:NSMakeRange(0, [imageHtml length])];
[innerRegex release];

if(firstMatch != nil)
{
    // TODO: assign the captured URL to newImage.detailsURL
    NSLog(@"found url: %@", [imageHtml substringWithRange:firstMatch.range]);
}

The only thing it logs is the full match (so href="http://tralalala.com" instead of http://tralalala.com).

How can I force it to return only the match for my first capturing parentheses?

A regex match stores the whole match as group 0; the capture groups in the pattern are numbered from 1. NSTextCheckingResult exposes these groups as ranges, which you read with rangeAtIndex:. Since your pattern contains one capture group, the following will work.

NSString *imageHtml = @"href=\"http://tralalala.com\"";  //a string with some HTML in it

NSRegularExpression* innerRegex = [[NSRegularExpression alloc] initWithPattern:@"href=\"(.*?)\"" options:NSRegularExpressionCaseInsensitive|NSRegularExpressionDotMatchesLineSeparators error:nil];
NSTextCheckingResult* firstMatch = [innerRegex firstMatchInString:imageHtml options:0 range:NSMakeRange(0, [imageHtml length])];
[innerRegex release];

if(firstMatch != nil)
{
    // firstMatch holds one range per group:
    // rangeAtIndex:0 = whole match, rangeAtIndex:1 = first capture group
    NSLog(@"found url: %@", [imageHtml substringWithRange:[firstMatch rangeAtIndex:1]]);
}
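
As a slightly more defensive variant of the snippet above, here is a minimal sketch that wraps the lookup in a function. It assumes ARC (so there is no release call), and the function name FirstHrefValue is hypothetical, not something from the question. It also checks for a range whose location is NSNotFound, which is what rangeAtIndex: reports for a group that did not take part in the match.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (the name is an assumption, not from the question):
// returns the value of the first href attribute in `html`, or nil if none
// is found. Assumes ARC, so nothing is released manually.
static NSString *FirstHrefValue(NSString *html) {
    if (html == nil) {
        return nil;
    }

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"href=\"(.*?)\""
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"invalid pattern: %@", error);
        return nil;
    }

    NSTextCheckingResult *match =
        [regex firstMatchInString:html
                          options:0
                            range:NSMakeRange(0, html.length)];
    if (match == nil || match.numberOfRanges < 2) {
        return nil;
    }

    // Index 0 is the whole match; index 1 is the first capture group.
    NSRange urlRange = [match rangeAtIndex:1];
    if (urlRange.location == NSNotFound) {
        return nil;
    }
    return [html substringWithRange:urlRange];
}

int main(void) {
    @autoreleasepool {
        NSString *html = @"<a href=\"http://tralalala.com\">link</a>";
        NSLog(@"found url: %@", FirstHrefValue(html));
    }
    return 0;
}
```

With the sample string above, the logged value should be the bare URL without the surrounding href="...".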

You need a pattern something like this, using a lookbehind and a lookahead so that the full match is only the URL:

(?<=href=\")(.*?)(?=\")
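
A minimal sketch of how this lookaround pattern might be used, assuming ARC and a sample HTML string made up for illustration. Because href=" and the closing quote sit inside assertions, they are not part of the match, so the range of the whole match can be used directly and no group index is needed.

```objective-c
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // Sample input, assumed for illustration only.
        NSString *html = @"<a href=\"http://tralalala.com\">link</a>";

        NSError *error = nil;
        NSRegularExpression *regex =
            [NSRegularExpression regularExpressionWithPattern:@"(?<=href=\")(.*?)(?=\")"
                                                      options:NSRegularExpressionCaseInsensitive
                                                        error:&error];
        if (regex == nil) {
            NSLog(@"invalid pattern: %@", error);
            return 1;
        }

        // The lookbehind and lookahead consume no characters, so the
        // full-match range covers only the URL.
        NSRange urlRange = [regex rangeOfFirstMatchInString:html
                                                    options:0
                                                      range:NSMakeRange(0, html.length)];
        if (urlRange.location != NSNotFound) {
            NSLog(@"found url: %@", [html substringWithRange:urlRange]);
        }
    }
    return 0;
}
```

The capture-group approach is usually easier to read; the lookaround form is mainly useful when the calling code can only work with the full match.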
