How can I remove or replace all punctuation characters from a String?

I have a string composed of words, some of which contain punctuation, which I would like to remove, but I have been unable to figure out how to do this.

For example if I have something like

var words = "Hello, this : is .. a  string?"

I would like to be able to create an array with

["Hello", "this", "is", "a", "string"]

My original thought was to use something like words.stringByTrimmingCharactersInSet() to remove any characters I didn't want but that would only take characters off the ends.

I thought maybe I could iterate through the string with something in the vein of

for letter in words {
    if NSCharacterSet.punctuationCharacterSet.characterIsMember(letter){
        //remove that character from the string
    }
}

but I'm unsure how to remove the character from the string. I'm sure there are some problems with the way that if statement is set up, as well, but it shows my thought process.

Xcode 11.4 • Swift 5.2 or later

extension StringProtocol {
    var words: [SubSequence] {
        split(whereSeparator: \.isLetter.negation)
    }
}

extension Bool {
    var negation: Bool { !self }
}

let sentence = "Hello, this : is .. a  string?"
let words = sentence.words  // ["Hello", "this", "is", "a", "string"]
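
If the extra Bool extension is not wanted, the same split can be written with a closure. The following is only a minimal sketch, not part of the answer above; the names input, tokens and contraction are illustrative. It also shows a caveat of splitting on non-letters: apostrophes (and digits) act as separators too.

```swift
let input = "Hello, this : is .. a  string?"

// Split wherever a character is not a letter; empty pieces are omitted by default.
let tokens = input.split { !$0.isLetter }.map(String.init)
print(tokens) // ["Hello", "this", "is", "a", "string"]

// Caveat: an apostrophe is not a letter, so contractions are broken apart.
let contraction = "don't stop".split { !$0.isLetter }.map(String.init)
print(contraction) // ["don", "t", "stop"]
```

Whether that caveat matters depends on the input; for text with contractions, a word-boundary based approach may be a better fit.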

 

String has an enumerateSubstringsInRange() method (enumerateSubstrings(in:options:) in Swift 3 and later). With the .ByWords option (.byWords in Swift 3 and later), it detects word boundaries and skips punctuation automatically:

Swift 3/4:

let string = "Hello, this : is .. a \"string\"!"
var words : [String] = []
string.enumerateSubstrings(in: string.startIndex..<string.endIndex,
                                  options: .byWords) {
                                    (substring, _, _, _) -> () in
                                    words.append(substring!)
}
print(words) // ["Hello", "this", "is", "a", "string"]

Swift 2:

let string = "Hello, this : is .. a \"string\"!"
var words : [String] = []
string.enumerateSubstringsInRange(string.characters.indices,
    options: .ByWords) {
        (substring, _, _, _) -> () in
        words.append(substring!)
}
print(words) // ["Hello", "this", "is", "a", "string"]

This works with Xcode 8.1, Swift 3:

First, define a general-purpose extension for filtering by CharacterSet:

extension String {
  func removingCharacters(inCharacterSet forbiddenCharacters: CharacterSet) -> String {
    var filteredString = self
    // Repeatedly remove the first forbidden character until none remain.
    while let forbiddenCharRange = filteredString.rangeOfCharacter(from: forbiddenCharacters) {
      filteredString.removeSubrange(forbiddenCharRange)
    }
    return filteredString
  }
}

Then filter using punctuation:

let s:String = "Hello, world!"
s.removingCharacters(inCharacterSet: CharacterSet.punctuationCharacters) // => "Hello world"
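
To get from the filtered string to the array of words asked for in the question, one more split is needed. The sketch below is an assumption-laden variant rather than the answer's own code: removingScalars(in:) is a hypothetical helper that filters Unicode scalars in a single pass instead of searching repeatedly, and the result is then split on spaces.

```swift
import Foundation

extension String {
    // Hypothetical helper: keeps only the Unicode scalars that are not in `set`.
    func removingScalars(in set: CharacterSet) -> String {
        let kept = unicodeScalars.filter { !set.contains($0) }
        return String(String.UnicodeScalarView(kept))
    }
}

let cleaned = "Hello, this : is .. a  string?".removingScalars(in: .punctuationCharacters)
print(cleaned) // "Hello this  is  a  string"

// split(separator:) drops the empty pieces produced by consecutive spaces.
let cleanedWords = cleaned.split(separator: " ").map(String.init)
print(cleanedWords) // ["Hello", "this", "is", "a", "string"]
```

CharacterSet.punctuationCharacters covers Unicode punctuation only; symbols such as "$" or "+" are in a different category and would be kept.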

NSScanner way (Swift 1.x syntax):

let words = "Hello, this : is .. a  string?"

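let scanner = NSScanner(string: words)
var wordArray:[String] = []
var word:NSString? = ""

// Characters that count as part of a word (all ASCII letters).
let letters = NSCharacterSet(charactersInString: "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")

while(!scanner.atEnd) {
  // Try to read a run of letters at the current location.
  let sr = scanner.scanCharactersFromSet(letters, intoString: &word)
  if !sr {
    // Not a letter: skip one character and try again.
    scanner.scanLocation++
    continue
  }
  wordArray.append(String(word!))
}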

println(wordArray)
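
The same scanning loop might look roughly like this in Swift 5, where NSScanner is called Scanner. This is a sketch under assumptions: it relies on the newer scanCharacters(from:) and scanUpToCharacters(from:) methods that return optional strings (on Apple platforms these require macOS 10.15 / iOS 13 or later), and it uses CharacterSet.letters instead of a hand-written alphabet.

```swift
import Foundation

let text = "Hello, this : is .. a  string?"
let scanner = Scanner(string: text)
// Do not silently skip whitespace; every character is handled explicitly.
scanner.charactersToBeSkipped = nil

let letterSet = CharacterSet.letters
var scannedWords: [String] = []

while !scanner.isAtEnd {
    if let word = scanner.scanCharacters(from: letterSet) {
        // A run of letters was found at the current position.
        scannedWords.append(word)
    } else {
        // Otherwise advance past everything up to the next letter (or the end).
        _ = scanner.scanUpToCharacters(from: letterSet)
    }
}

print(scannedWords) // ["Hello", "this", "is", "a", "string"]
```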

An alternate way to filter characters from a set and obtain an array of words is by using the array's filter and reduce methods. It's not as compact as other answers, but it shows how the same result can be obtained in a different way.

First, define the set of characters to remove:

let charactersToRemove = Set(Array(".:?,"))

Next, convert the input string into an array of characters:

let arrayOfChars = Array(words)

Now we can use reduce to build a string by appending the elements of arrayOfChars, skipping all the ones included in charactersToRemove:

let filteredString = arrayOfChars.reduce("") {
    let str = String($1)
    return $0 + (charactersToRemove.contains($1) ? "" : str)
}

This produces a string without the punctuation characters (as defined in charactersToRemove).

The last 2 steps:

First, split the string into an array of words, using the space character as the separator:

let arrayOfWords = filteredString.componentsSeparatedByString(" ")

Last, remove all empty elements:

let finalArrayOfWords = arrayOfWords.filter { $0.isEmpty == false }
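
The snippets above use early Swift syntax. A rough equivalent in current Swift (4 or later) is sketched below; it assumes the same hand-picked set of characters, and the names sentenceText, unwanted, filteredText and finalWords are illustrative only.

```swift
let sentenceText = "Hello, this : is .. a  string?"

// The characters to drop, as a set for fast membership checks.
let unwanted: Set<Character> = [".", ":", "?", ","]

// String is a collection of Character, so filter can be applied directly.
let filteredText = String(sentenceText.filter { !unwanted.contains($0) })
print(filteredText) // "Hello this  is  a  string"

// split(separator:) omits empty pieces, so no separate filtering step is needed.
let finalWords = filteredText.split(separator: " ").map(String.init)
print(finalWords) // ["Hello", "this", "is", "a", "string"]
```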
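
Another option is to split the string on the punctuation character set and join the pieces back together, which yields the string without punctuation:

let charactersToRemove = NSCharacterSet.punctuationCharacterSet()
let aWord = "".join(words.componentsSeparatedByCharactersInSet(charactersToRemove))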
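
In current Swift the split-and-join idea could be written as follows. This is a sketch, assuming Foundation's components(separatedBy:) with CharacterSet.punctuationCharacters as the separator set; the names original, stripped and wordList are illustrative.

```swift
import Foundation

let original = "Hello, this : is .. a  string?"

// Split on every punctuation character, then glue the pieces back together.
let stripped = original.components(separatedBy: .punctuationCharacters).joined()
print(stripped) // "Hello this  is  a  string"

// Split on whitespace and drop the empty strings left by repeated spaces.
let wordList = stripped.components(separatedBy: .whitespaces).filter { !$0.isEmpty }
print(wordList) // ["Hello", "this", "is", "a", "string"]
```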
