As a generic solution, how can we get the Unicode code point(s) for a character or a string in Swift?
Consider the following:
let A: Character = "A" // "\u{0041}"
let Á: Character = "Á" // "\u{0041}\u{0301}"
let sparklingHeart = "💖" // "\u{1F496}"
let SWIFT = "SWIFT" // "\u{0053}\u{0057}\u{0049}\u{0046}\u{0054}"
If I am not mistaken, the desired function might return an array of strings, for instance:
extension Character {
    func getUnicodeCodePoints() -> [String] {
        //...
    }
}
A.getUnicodeCodePoints()
// the output should be: ["\u{0041}"]
Á.getUnicodeCodePoints()
// the output should be: ["\u{0041}", "\u{0301}"]
sparklingHeart.getUnicodeCodePoints()
// the output should be: ["\u{1F496}"]
SWIFT.getUnicodeCodePoints()
// the output should be: ["\u{0053}", "\u{0057}", "\u{0049}", "\u{0046}", "\u{0054}"]
Any suggestions for a more elegant approach would be appreciated.
Generally, the unicodeScalars property of a String returns a collection of its Unicode scalar values. (A Unicode scalar value is any Unicode code point except high-surrogate and low-surrogate code points.)
Example:
print(Array("Á".unicodeScalars)) // ["A", "\u{0301}"]
print(Array("💖".unicodeScalars)) // ["\u{0001F496}"]
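To make the difference between scalar values and surrogate code points concrete, here is a small sketch (the variable names are only illustrative). It compares the unicodeScalars view with the utf16 view of the same string and checks that a surrogate value cannot be turned into a Unicode.Scalar:

```swift
let heart = "\u{1F496}" // 💖

// The scalar view holds a single value for this character.
let scalars = heart.unicodeScalars.map { $0.value }

// The UTF-16 view holds two code units: a high and a low surrogate.
let utf16Units = Array(heart.utf16)

print(scalars.map { String($0, radix: 16, uppercase: true) })    // ["1F496"]
print(utf16Units.map { String($0, radix: 16, uppercase: true) }) // ["D83D", "DC96"]

// A surrogate code point is not a scalar value, so the failable
// initializer returns nil for it.
let highSurrogate = Unicode.Scalar(UInt32(0xD83D))
print(highSurrogate == nil) // true
```

Note that the name Unicode.Scalar assumes Swift 4 or later; in earlier versions the same type is spelled UnicodeScalar.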
Up to Swift 3, there is no way to access the Unicode scalar values of a Character directly; it has to be converted to a String first (for the Swift 4 status, see below).
If you want to see all Unicode scalar values as hexadecimal numbers, you can access the value property (which is a UInt32 number) and format it according to your needs.
Example (using the U+NNNN notation for Unicode values):
extension String {
    func getUnicodeCodePoints() -> [String] {
        return unicodeScalars.map { "U+" + String($0.value, radix: 16, uppercase: true) }
    }
}
extension Character {
    func getUnicodeCodePoints() -> [String] {
        return String(self).getUnicodeCodePoints()
    }
}
print("A".getUnicodeCodePoints()) // ["U+41"]
print("Á".getUnicodeCodePoints()) // ["U+41", "U+301"]
print("💖".getUnicodeCodePoints()) // ["U+1F496"]
print("SWIFT".getUnicodeCodePoints()) // ["U+53", "U+57", "U+49", "U+46", "U+54"]
print("🇯🇴".getUnicodeCodePoints()) // ["U+1F1EF", "U+1F1F4"]
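The output above is not zero-padded. If you prefer at least four hex digits, closer to the "\u{0041}" style in the question, one possible sketch is the following. The method name paddedUnicodeCodePoints is hypothetical and chosen only for this illustration; the padding is done by hand so that no Foundation import is needed:

```swift
extension String {
    // Hypothetical helper: like getUnicodeCodePoints(), but pads each
    // value with leading zeros to at least four hex digits.
    func paddedUnicodeCodePoints() -> [String] {
        return unicodeScalars.map { scalar in
            let hex = String(scalar.value, radix: 16, uppercase: true)
            // Values with more than four digits are left unchanged.
            let padding = String(repeating: "0", count: max(0, 4 - hex.count))
            return "U+" + padding + hex
        }
    }
}

print("A\u{0301}".paddedUnicodeCodePoints()) // ["U+0041", "U+0301"]
print("\u{1F496}".paddedUnicodeCodePoints()) // ["U+1F496"]
```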
Update for Swift 4:
As of Swift 4, the unicodeScalars of a Character can be accessed directly, see SE-0178 "Add unicodeScalars property to Character". This makes the conversion to a String obsolete:
let c: Character = "🇯🇴"
print(Array(c.unicodeScalars)) // ["\u{0001F1EF}", "\u{0001F1F4}"]
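Assuming Swift 4 or later, a Character extension can therefore format the scalars without going through String first. A minimal sketch (the method name unicodeCodePointStrings is hypothetical, picked so that it does not clash with the extension shown earlier):

```swift
extension Character {
    // Hypothetical helper: reads the Character's own unicodeScalars view
    // (available since Swift 4) instead of converting to a String.
    func unicodeCodePointStrings() -> [String] {
        return unicodeScalars.map { "U+" + String($0.value, radix: 16, uppercase: true) }
    }
}

let flag: Character = "\u{1F1EF}\u{1F1F4}" // 🇯🇴
print(flag.unicodeCodePointStrings()) // ["U+1F1EF", "U+1F1F4"]
```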