
Limit text to a certain number of words in Swift

In a mobile app I use an API that can only handle about 300 words. How can I trim a string in Swift so that it doesn't contain more words than that?

The native .trimmingCharacters(in: CharacterSet) does not seem able to do this, since it is intended to trim specific characters from the ends of a string.

There is no off-the-shelf way to limit the number of words in a string.

If you look at this post, it documents using the method enumerateSubstrings(in: Range) and setting an option of .byWords. The method does not return anything itself; it calls your closure once per word and passes in that word's range, so you can collect those Range values into an array.
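As a rough illustration of how that enumeration behaves, the sketch below collects each word the closure receives. The sample string and the variable names are made up for this example.

```swift
import Foundation

// Sample input, invented for this example.
let text = "Hello, world! How are you?"
var words: [String] = []

// With .byWords the closure is called once per word; punctuation and
// whitespace between words are skipped.
text.enumerateSubstrings(in: text.startIndex..., options: .byWords) { substring, _, _, _ in
    if let word = substring {
        words.append(word)
    }
}

print(words) // ["Hello", "world", "How", "are", "you"]
```

The second closure parameter (ignored here) is the range of the word within the original string, which is what you would keep if you want to slice the string rather than copy the words.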

You could use that to create an extension on String that would return the first X words of that string:

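import Foundation

extension String {
    /// Returns the text from the start of the string through the end of word number `wordCount`.
    func firstXWords(_ wordCount: Int) -> Substring {
        // A non-positive count yields an empty result.
        guard wordCount > 0 else { return self[startIndex..<startIndex] }
        var ranges: [Range<String.Index>] = []
        // Collect the range of every word in the string.
        enumerateSubstrings(in: startIndex..., options: .byWords) { _, range, _, _ in
            ranges.append(range)
        }
        // Fewer words than requested: return the whole string.
        guard ranges.count >= wordCount else { return self[...] }
        return self[..<ranges[wordCount - 1].upperBound]
    }
}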

If we then run the code:

let sentence = "I want to an algorithm that could help find out how many words are there in a string separated by space or comma or some character. And then append each word separated by a character to an array which could be added up later I'm making an average calculator so I want the total count of data and then add up all the words. By words I mean the numbers separated by a character, preferably space Thanks in advance"

print(sentence.firstXWords(10))

The output is:

I want to an algorithm that could help find out

Using enumerateSubstrings(in: Range) is going to give much better results than splitting your string on spaces, since normal text contains many more separators than just spaces (newlines, commas, colons, em spaces, etc.). It will also work for languages like Japanese and Chinese that often don't have spaces between words.
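To make the difference concrete, here is a small sketch that counts the "words" in the same text both ways. The sample string is invented; it deliberately uses a comma, a newline, and a semicolon as separators, with only one space.

```swift
import Foundation

// Invented sample: four words, but only one space character.
let text = "red,green\nblue; yellow"

// Naive approach: split on the space character only.
let bySpaces = text.split(separator: " ")

// Word-boundary approach: let Foundation decide where words start and end.
var byWords: [String] = []
text.enumerateSubstrings(in: text.startIndex..., options: .byWords) { substring, _, _, _ in
    if let word = substring {
        byWords.append(word)
    }
}

print(bySpaces.count) // 2
print(byWords)
```

Splitting on spaces sees two pieces because the comma, newline, and semicolon are not treated as separators, while the word enumeration should report the four words on their own.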

You might be able to rewrite the function to terminate the enumeration of the string as soon as it reaches the desired number of words. If you want only a small fraction of the words in a very long string, that would make it significantly faster. The code above should have O(n) performance in the length of the string, although I haven't dug deeply enough to be sure of that. I also couldn't figure out how to terminate the enumerateSubstrings() function early, although I didn't try that hard.

Leo Dabus provided an improved version of my function. It extends StringProtocol rather than String, which means it can work on substrings. Plus, it stops once it hits your desired word count, so it will be much faster for finding the first few words of very long strings:

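import Foundation

extension StringProtocol {
    func firstXWords(_ n: Int) -> SubSequence {
        // A non-positive count yields an empty result.
        guard n > 0 else { return self[..<startIndex] }
        var endIndex = self.endIndex
        var words = 0
        enumerateSubstrings(in: startIndex..., options: .byWords) { _, range, _, stop in
            words += 1
            if words == n {
                // Setting the inout flag ends the enumeration early.
                stop = true
                endIndex = range.upperBound
            }
        }
        return self[..<endIndex]
    }
}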
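As a usage sketch for the original question (capping text before sending it to an API), the example below applies the same early-stopping approach and converts the result back to a String. The helper name limited(toWords:) and the sample text are hypothetical, chosen only so this example is self-contained.

```swift
import Foundation

extension StringProtocol {
    // Hypothetical helper for this example; same early-stopping idea as above.
    func limited(toWords maxWords: Int) -> SubSequence {
        // A non-positive limit yields an empty result.
        guard maxWords > 0 else { return self[..<startIndex] }
        var end = endIndex
        var count = 0
        enumerateSubstrings(in: startIndex..., options: .byWords) { _, range, _, stop in
            count += 1
            if count == maxWords {
                end = range.upperBound
                stop = true // stop enumerating once the limit is reached
            }
        }
        return self[..<end]
    }
}

// Invented sample text.
let text = "The quick brown fox jumps over the lazy dog"

// The result is a slice of the original; wrap it in String() before
// handing it to code that expects a String.
let capped = String(text.limited(toWords: 3))
print(capped) // The quick brown

// Asking for more words than exist returns the whole text.
print(String(text.limited(toWords: 50)) == text) // true

// Because the extension is on StringProtocol, it can also be called on a Substring.
print(text.dropFirst(4).limited(toWords: 2))
```

For the 300-word API limit from the question, the call would be String(text.limited(toWords: 300)).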
