Truncate string containing emoji or unicode characters at word or character boundaries

Question

How can I truncate a string at a given length without cutting through a Unicode character that happens to straddle the cut position? How can I determine the index at which a Unicode character begins, so that I avoid creating ugly strings? In my output, the square with half of an A visible marks the position of another emoji character that has been truncated.

-(NSMutableAttributedString*)constructStatusAttributedStringWithRange:(CFRange)range
{
NSString *original = [_postDictionay objectForKey:@"message"];

NSMutableString *truncated = [NSMutableString string];

NSArray *components = [original componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];

for(int x=0; x<[components count]; x++)
{
    //If the truncated string is still shorter than the desired range (leave space for ...)
    if([truncated length]+[[components objectAtIndex:x] length]<range.length-3)
    {
        //Just checking if it's the first word
        if([truncated length]==0 && x==0)
        {
            //start off the string
            [truncated appendString:[components objectAtIndex:0]];
        }
        else
        {
            //append a new word to the string
            [truncated appendFormat:@" %@",[components objectAtIndex:x]];
        }

    }
    else
    {
        x=[components count];
    }
}

if([truncated length]==0 || [truncated length]< range.length-20)
{
    truncated = [NSMutableString stringWithString:[original substringWithRange:NSMakeRange(range.location, range.length-3)]];
}

[truncated appendString:@"..."];

NSMutableAttributedString *statusString = [[NSMutableAttributedString alloc]initWithString:truncated];
[statusString addAttribute:(id)kCTFontAttributeName value:[StyleSingleton streamStatusFont] range:NSMakeRange(0, [statusString length])];
[statusString addAttribute:(id)kCTForegroundColorAttributeName value:(id)[StyleSingleton streamStatusColor].CGColor range:NSMakeRange(0, [statusString length])];

return statusString;

}

UPDATE: Thanks to the answer, I was able to use one simple function for my needs!

-(NSMutableAttributedString*)constructStatusAttributedStringWithRange:(CFRange)range
{
NSString *original = [_postDictionay objectForKey:@"message"];

NSMutableString *truncated = [NSMutableString stringWithString:[original substringWithRange:[original rangeOfComposedCharacterSequencesForRange:NSMakeRange(range.location, range.length-3)]]];
[truncated appendString:@"..."];

NSMutableAttributedString *statusString = [[NSMutableAttributedString alloc]initWithString:truncated];
[statusString addAttribute:(id)kCTFontAttributeName value:[StyleSingleton streamStatusFont] range:NSMakeRange(0, [statusString length])];
[statusString addAttribute:(id)kCTForegroundColorAttributeName value:(id)[StyleSingleton streamStatusColor].CGColor range:NSMakeRange(0, [statusString length])];

return statusString;

}

NSString has a method, rangeOfComposedCharacterSequencesForRange:, that you can use to find the enclosing range in the string that contains only complete composed characters. For example,

NSString *s =  @"😄";
NSRange r = [s rangeOfComposedCharacterSequencesForRange:NSMakeRange(0, 1)];

gives the range { 0, 2 } because the emoji character is stored as two UTF-16 code units (a surrogate pair) in the string.
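Because rangeOfComposedCharacterSequencesForRange: returns the enclosing range, the result can be slightly longer than the range you asked for. If the length is a hard upper limit, one option is to back up to the start of the character that straddles the cut instead. The following is only a minimal sketch of that idea; SafePrefix is a hypothetical helper name, not part of Foundation, and lengths are counted in UTF-16 code units as NSString does.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Foundation API): returns at most maxLength
// UTF-16 code units of s, never cutting a composed character in half.
static NSString *SafePrefix(NSString *s, NSUInteger maxLength) {
    if ([s length] <= maxLength) {
        return s; // nothing to truncate
    }
    // Range of the composed character sequence that contains the cut index.
    // If the cut falls inside a surrogate pair, r.location is before maxLength.
    NSRange r = [s rangeOfComposedCharacterSequenceAtIndex:maxLength];
    // Cutting at r.location drops the straddling character entirely.
    return [s substringToIndex:r.location];
}
```

For example, SafePrefix(@"a😄b", 2) yields @"a" (the emoji would not fit), while SafePrefix(@"a😄b", 3) yields @"a😄".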

Remark: You could also check whether your first loop can be simplified by using

enumerateSubstringsInRange:options:usingBlock:

with the NSStringEnumerationByWords option.
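As a rough illustration of that remark, the word loop might look like the sketch below. TruncateAtWordBoundary is a hypothetical helper name, and the sketch assumes the limit is given in UTF-16 code units and that the caller appends the "..." afterwards.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: keeps whole words for as long as the result
// fits into maxLength UTF-16 code units.
static NSString *TruncateAtWordBoundary(NSString *s, NSUInteger maxLength) {
    if ([s length] <= maxLength) {
        return s; // nothing to truncate
    }
    __block NSUInteger end = 0; // end index of the last word that still fits
    [s enumerateSubstringsInRange:NSMakeRange(0, [s length])
                          options:NSStringEnumerationByWords
                       usingBlock:^(NSString *word, NSRange wordRange,
                                    NSRange enclosingRange, BOOL *stop) {
        if (NSMaxRange(wordRange) > maxLength) {
            *stop = YES;                  // this word would exceed the limit
        } else {
            end = NSMaxRange(wordRange);  // this word fits; remember its end
        }
    }];
    return [s substringToIndex:end];
}
```

Note that this returns an empty string when even the first word is longer than the limit, so a caller would probably want to fall back to a character-based cut in that case, much as the original code does.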

"truncate a string at a given length": do you mean length as in byte length, or length as in number of characters? If the latter, then a simple substringToIndex: will suffice (check the bounds first, and keep in mind that NSString indexes count UTF-16 code units, so an index can still fall inside a surrogate pair). If the former, then I'm afraid you'll have to do something like:

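#import <Foundation/Foundation.h>

// Returns the longest prefix of original whose encoded form in targetEncoding
// fits into maxBytesToRead bytes, without splitting a composed character.
NSString *TruncateString(NSString *original, NSUInteger maxBytesToRead, NSStringEncoding targetEncoding) {
    NSMutableString *truncatedString = [NSMutableString string];

    NSUInteger bytesRead = 0;
    NSUInteger charIdx = 0;

    while (charIdx < [original length]) {
        // Take a whole composed character sequence (e.g. a surrogate pair),
        // not a single UTF-16 code unit.
        NSRange charRange = [original rangeOfComposedCharacterSequenceAtIndex:charIdx];
        NSString *character = [original substringWithRange:charRange];

        bytesRead += [character lengthOfBytesUsingEncoding:targetEncoding];

        // Stop as soon as the next character would exceed the byte budget.
        if (bytesRead > maxBytesToRead)
            break;

        [truncatedString appendString:character];
        charIdx = NSMaxRange(charRange);
    }

    return truncatedString;
}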

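EDIT: Your code can be rewritten as follows (note that this treats range as a range of words rather than of characters):

NSString *original = [_postDictionay objectForKey:@"message"];

// Split on whitespace and drop the empty strings produced by consecutive separators.
NSArray *words = [[original componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceCharacterSet]] filteredArrayUsingPredicate:[NSPredicate predicateWithFormat:@"SELF != ''"]];

// subarrayWithRange: takes an NSRange (not a CFRange) and throws if the range exceeds the array, so clamp it.
NSUInteger start = MIN((NSUInteger)range.location, [words count]);
NSUInteger count = MIN((NSUInteger)range.length, [words count] - start);
NSArray *truncatedWords = [words subarrayWithRange:NSMakeRange(start, count)];

NSString *truncated = [NSString stringWithFormat:@"%@...", [truncatedWords componentsJoinedByString:@" "]];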
