Search for any characters in an NSString

I have the following code to search an NSString:

for (NSDictionary *obj in data) {
    NSString *objQuestion = [obj objectForKey:@"Question"];
    NSRange dataRange = [objQuestion rangeOfString:searchText options:NSCaseInsensitiveSearch];
    if (dataRange.location != NSNotFound) {
        [filteredData addObject:obj];
    }
}

This works fine, but there is a problem. If objQuestion is "Green Yellow Red" and I search for "Yellow Green Red", the object does not show up because my search terms are not in the same order.

How would I change my code so that the object shows up no matter what order I type the words in?

You should break your search text into words and search for each word.

// Split the search text on spaces; each component is searched separately.
NSArray *wordArray = [searchText componentsSeparatedByString:@" "];
for (NSDictionary *obj in data) {
    NSString *objQuestion = [obj objectForKey:@"Question"];
    BOOL present = NO;
    for (NSString *s in wordArray) {
        // Consecutive spaces yield empty components; skip them.
        if (s.length > 0) {
            NSRange dataRange = [objQuestion rangeOfString:s options:NSCaseInsensitiveSearch];
            if (dataRange.location != NSNotFound) {
                // One matching word is enough to include the object.
                present = YES;
                break;
            }
        }
    }
    if (present) {
        [filteredData addObject:obj];
    }
}
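
The loop above includes an object as soon as any one word matches. If every word has to be present, regardless of order, the check could look roughly like the following. This is a minimal sketch, not part of the original answer: the helper name `TextContainsAllWords` is hypothetical, and it assumes an empty search should match nothing.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original answer): returns YES only if
// every whitespace-separated word in searchText occurs somewhere in text,
// ignoring case and word order.
static BOOL TextContainsAllWords(NSString *text, NSString *searchText) {
    NSArray<NSString *> *words =
        [searchText componentsSeparatedByCharactersInSet:
            [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    BOOL sawWord = NO;
    for (NSString *word in words) {
        if (word.length == 0) {
            continue; // runs of whitespace produce empty components
        }
        sawWord = YES;
        NSRange range = [text rangeOfString:word options:NSCaseInsensitiveSearch];
        if (range.location == NSNotFound) {
            return NO; // one missing word rejects the text
        }
    }
    // Assumption: a search with no words at all matches nothing.
    return sawWord;
}
```

Inside the original loop this would be used as `if (TextContainsAllWords(objQuestion, searchText)) { [filteredData addObject:obj]; }`.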

So you basically want to do a keyword search? I would recommend a regular expression search in which the words can be in any order.

Something like this:

(your|test|data)? *(your|test|data)? *(your|test|data)?

You can use it in an NSRegularExpression:

NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(your|test|data)? *(your|test|data)? *(your|test|data)?" options:NSRegularExpressionCaseInsensitive error:&error];
NSUInteger numMatches = [regex numberOfMatchesInString:searchString options:0 range:NSMakeRange(0, [searchString length])];

This will match the words in any order in an efficient manner.

I am not sure this regex works as-is in Objective-C, because I do not have a Mac in front of me right now, but it should be okay.
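
One thing to watch with the pattern above: every group in it is optional, so it can also match the empty string, and a non-zero match count alone does not show that all the words are present. A possible variation, sketched here under stated assumptions, builds a non-optional alternation from the keywords and then checks that each keyword was matched at least once. The helper name `StringContainsAllKeywords` is hypothetical, the keywords are assumed to be plain words, and matching is whole-word only.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original answer): YES if every keyword
// appears as a whole word in string, in any order, ignoring case.
static BOOL StringContainsAllKeywords(NSString *string, NSArray<NSString *> *keywords) {
    NSMutableSet<NSString *> *wanted = [NSMutableSet set];
    NSMutableArray<NSString *> *escaped = [NSMutableArray array];
    for (NSString *keyword in keywords) {
        if (keyword.length == 0) {
            continue;
        }
        [wanted addObject:keyword.lowercaseString];
        // Escape each keyword so regex metacharacters are taken literally.
        [escaped addObject:[NSRegularExpression escapedPatternForString:keyword]];
    }
    if (wanted.count == 0) {
        return NO; // assumption: no keywords means no match
    }

    // Builds a pattern such as \b(?:your|test|data)\b
    NSString *pattern = [NSString stringWithFormat:@"\\b(?:%@)\\b",
                         [escaped componentsJoinedByString:@"|"]];
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (!regex) {
        return NO;
    }

    // Collect the distinct keywords that actually occur in the string.
    NSMutableSet<NSString *> *found = [NSMutableSet set];
    [regex enumerateMatchesInString:string
                            options:0
                              range:NSMakeRange(0, string.length)
                         usingBlock:^(NSTextCheckingResult * _Nullable result,
                                      NSMatchingFlags flags, BOOL *stop) {
        if (result) {
            [found addObject:[string substringWithRange:result.range].lowercaseString];
        }
    }];
    return [wanted isSubsetOfSet:found];
}
```

Comparing lowercased strings is a simplification that is fine for ASCII keywords; text with more unusual case mappings may need folding first.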

You might want to consider that the search input string is not always as clean as you expect, and could contain punctuation, brackets, etc.

You'd also want to be lax with accents.

I like to use regular expressions for this sort of problem, and since you are looking for a solution that allows arbitrary ordering of the search terms, we need to rework the search string. We can use regular expressions for that, too, so the pattern is constructed by a regex substitution, just out of principle. You may want to document it thoroughly.

So here is a code snippet that will do these things:

// Use the Posix locale as the lowest common denominator of locales to
// remove accents.
NSLocale *enLoc = [[NSLocale alloc] initWithLocaleIdentifier: @"en_US_POSIX"];

// Mixed bag of genres, but for testing purposes we get all the accents we need
NSString *orgString = @"Beyoncé Motörhead Händel"; 

// Clean the string by folding away accents and case differences, using the Posix locale
NSString *string = [orgString stringByFoldingWithOptions: NSCaseInsensitiveSearch | NSDiacriticInsensitiveSearch
                                                  locale: enLoc ];

// What the user has typed in, with misplaced umlaut and all
NSString *orgSearchString = @"handel, mötorhead, beyonce";

// Clean the search string, too
NSString *searchString = [orgSearchString stringByFoldingWithOptions: NSCaseInsensitiveSearch | NSDiacriticInsensitiveSearch | NSWidthInsensitiveSearch
                                                              locale: enLoc ];

// Turn the search string into a regex pattern.
// Create a pattern that looks like: "(?=.*handel)(?=.*motorhead)(?=.*beyonce)"
// This pattern uses positive lookahead to create an AND logic that will
// accept arbitrary ordering of the words in the pattern.
// The \b expression matches a word boundary, so gets rid of punctuation, etc.

// We use a regex to create the regex pattern.
NSString *regexifyPattern = @"(?w)(\\W*)(\\b.+?\\b)(\\W*)";

NSString *pattern = [searchString stringByReplacingOccurrencesOfString: regexifyPattern
                                                            withString: @"(?=.*$2)"
                                                               options: NSRegularExpressionSearch
                                                                 range: NSMakeRange(0, searchString.length) ];

NSError *error = nil;
NSRegularExpression *anyOrderRegEx = [NSRegularExpression regularExpressionWithPattern: pattern
                                                                               options: 0
                                                                                 error: &error];

if ( !anyOrderRegEx ) {

    // Regex patterns are tricky, programmatically constructed ones even more.
    // So we check if it went well and do something intelligent if it didn't

    // ...
}

// Match the constructed pattern with the string
NSUInteger numberOfMatches = [anyOrderRegEx numberOfMatchesInString: string
                                                    options: 0
                                                      range: NSMakeRange(0, string.length)];


BOOL found = (numberOfMatches > 0);

The use of the Posix locale identifier is discussed in this tech note from Apple.

In theory there is an edge case here if the user enters characters with a special meaning for regexes, but since the first regex removes non-word characters, that should already be taken care of. It is a bit of an unplanned positive side effect, so it could be worth verifying.
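
If you would rather not rely on that side effect, one option is to escape each word explicitly with `+[NSRegularExpression escapedPatternForString:]` before wrapping it in a lookahead. The following is only a sketch of that idea: the helper names `LookaheadPatternForWords` and `StringMatchesAllWords` are hypothetical, and the words are assumed to be already split and folded.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: builds "(?=.*word1)(?=.*word2)..." with every word
// escaped, so characters such as "+" or "." are matched literally.
static NSString *LookaheadPatternForWords(NSArray<NSString *> *words) {
    NSMutableString *pattern = [NSMutableString string];
    for (NSString *word in words) {
        if (word.length == 0) {
            continue;
        }
        [pattern appendFormat:@"(?=.*%@)",
            [NSRegularExpression escapedPatternForString:word]];
    }
    return pattern;
}

// Hypothetical helper: YES if string contains all words, in any order.
static BOOL StringMatchesAllWords(NSString *string, NSArray<NSString *> *words) {
    NSString *pattern = LookaheadPatternForWords(words);
    if (pattern.length == 0) {
        return NO; // assumption: no words means no match
    }
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (!regex) {
        return NO;
    }
    // The lookaheads consume nothing, so any match found is zero-length.
    NSTextCheckingResult *match =
        [regex firstMatchInString:string
                          options:0
                            range:NSMakeRange(0, string.length)];
    return match != nil;
}
```

Because `.` does not match line breaks by default, this sketch assumes the searched string is a single line.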

Should you not be interested in a regex-based solution, the string-folding code may still be useful for "normal" NSString-based searching.
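
As a rough illustration of that last point, the same folding can be combined with a plain `rangeOfString:` loop. This is a sketch under assumptions rather than a definitive implementation: the helper name `FoldedTextContainsAllWords` is hypothetical, and it treats every non-alphanumeric character in the search string as a separator.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: folds both strings (case, accents, width) and then
// checks that every word of the search text occurs in the text, in any order.
static BOOL FoldedTextContainsAllWords(NSString *text, NSString *searchText) {
    NSStringCompareOptions folding = NSCaseInsensitiveSearch |
                                     NSDiacriticInsensitiveSearch |
                                     NSWidthInsensitiveSearch;
    NSLocale *posix = [[NSLocale alloc] initWithLocaleIdentifier:@"en_US_POSIX"];
    NSString *foldedText = [text stringByFoldingWithOptions:folding locale:posix];
    NSString *foldedSearch = [searchText stringByFoldingWithOptions:folding locale:posix];

    // Assumption: punctuation and whitespace both separate search words.
    NSCharacterSet *separators = [[NSCharacterSet alphanumericCharacterSet] invertedSet];
    NSArray<NSString *> *words = [foldedSearch componentsSeparatedByCharactersInSet:separators];

    BOOL sawWord = NO;
    for (NSString *word in words) {
        if (word.length == 0) {
            continue; // adjacent separators produce empty components
        }
        sawWord = YES;
        if ([foldedText rangeOfString:word].location == NSNotFound) {
            return NO;
        }
    }
    return sawWord;
}
```

With the test strings from the snippet above, `FoldedTextContainsAllWords(@"Beyoncé Motörhead Händel", @"handel, mötorhead, beyonce")` is the kind of call this is meant for.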
