How to get a single Unicode character from a string
I wonder how I can get a single Unicode character from a string. For example, if the string is "你好", how can I get the first character "你"?
I found one approach elsewhere:
var str = "你好"
runes := []rune(str)
fmt.Println(string(runes[0]))
It does work, but I still have some questions:
Is there another way to do it?
Why does str[0] in Go return a byte rather than a Unicode character?
First, you may want to read https://blog.golang.org/strings, which will answer part of your question.
A string in Go can contain arbitrary bytes. When you write str[i], the result is a byte, and the index i is always a byte offset, not a character position.
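To make the byte-versus-character distinction concrete, here is a minimal sketch using the string from the question. The helper name byteAndRuneCounts is made up for illustration; len and utf8.RuneCountInString are standard.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCounts is a hypothetical helper: it returns the length of s
// in bytes and the number of runes it decodes to.
func byteAndRuneCounts(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	str := "你好"
	nBytes, nRunes := byteAndRuneCounts(str)
	fmt.Println(nBytes, nRunes) // 6 2: each of these characters takes 3 bytes in UTF-8

	// str[0] is only the first byte of the encoding of '你', not the character.
	fmt.Printf("%#x\n", str[0]) // 0xe4
}
```

This is why str[0] cannot give you "你": the character spans str[0:3].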
Most of the time, though, strings are encoded in UTF-8, and Go gives you several ways to decode UTF-8 text in a string.
For instance, you can use the for...range statement to iterate over a string rune by rune.
var first rune
for _, c := range str {
	first = c
	break
}
// first now contains the first rune of the string
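One detail worth seeing in action: the index that for...range yields is the byte offset at which each rune starts, not the rune's ordinal position. The sketch below collects those offsets; runeOffsets is a hypothetical helper name, not a library function.

```go
package main

import "fmt"

// runeOffsets is a hypothetical helper: it returns the byte offset at which
// each rune of s starts, as reported by for...range.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	for i, c := range "你好" {
		fmt.Println(i, string(c)) // 0 你, then 3 好
	}
	fmt.Println(runeOffsets("你好")) // [0 3]
}
```

For ASCII-only strings the offsets happen to coincide with rune positions, which can hide this difference.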
You can also use the unicode/utf8 package. For instance:
r, size := utf8.DecodeRuneInString(str)
// r contains the first rune of the string
// size is the size of the rune in bytes
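If you want the first character back as a string, or need to tell an empty string from invalid input, the returned size helps. The following is a sketch under the assumption that you want an error in both cases; decodeFirst is a made-up name, while utf8.DecodeRuneInString and utf8.RuneError are standard.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// decodeFirst is a hypothetical helper: it returns the first character of s
// as a string, or an error if s is empty or starts with invalid UTF-8.
func decodeFirst(s string) (string, error) {
	r, size := utf8.DecodeRuneInString(s)
	if r == utf8.RuneError {
		switch size {
		case 0:
			return "", errors.New("empty string")
		case 1:
			return "", errors.New("invalid UTF-8 at start of string")
		}
		// Any other size means s really starts with an encoded U+FFFD.
	}
	return s[:size], nil
}

func main() {
	first, err := decodeFirst("你好")
	fmt.Println(first, err) // 你 <nil>
}
```

Slicing with s[:size] avoids converting the whole string to []rune just to read one character.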
If the string is encoded in UTF-8, there is no direct way to access the nth rune of the string, because the size of a rune (in bytes) is not constant: you have to decode from the start. If you need this feature, you can easily write your own helper function to do it (with for...range, or with the unicode/utf8 package).
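Such a helper could look like the sketch below, written with for...range. The name nthRune and the (rune, bool) return shape are choices made for this illustration, not an existing API.

```go
package main

import "fmt"

// nthRune is a hypothetical helper: it returns the n-th rune of s (0-based)
// and true, or 0 and false if s has fewer than n+1 runes.
func nthRune(s string, n int) (rune, bool) {
	if n < 0 {
		return 0, false
	}
	count := 0 // number of runes seen so far (not the byte offset)
	for _, r := range s {
		if count == n {
			return r, true
		}
		count++
	}
	return 0, false
}

func main() {
	r, ok := nthRune("你好", 1)
	fmt.Println(string(r), ok) // 好 true

	_, ok = nthRune("你好", 2)
	fmt.Println(ok) // false
}
```

Each call scans from the beginning of the string, so the cost grows with n. If you need many random accesses, converting once to []rune may be simpler.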
If you want the first rune as a string, you can do:
func firstChar(str string) string {
	if str == "" { // SplitN returns an empty slice for "", so indexing it would panic
		return ""
	}
	return strings.SplitN(str, "", 2)[0]
}
But if you want it as a rune, @DidierSpezia's solution is the best:
func firstRune(str string) (r rune) {
	for _, r = range str {
		return
	}
	return
}
You can check it in the Go Playground.
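You can do this:
package main

import "fmt"

func main() {
	str := "cat"
	var s rune
	// i is the byte offset of each rune; it matches the rune index only for ASCII-only strings such as "cat"
	for i, c := range str {
		if i == 2 {
			s = c
		}
	}
	fmt.Println(string(s)) // prints "t"
}
s is now equal to 't', the rune at byte offset 2. For strings containing multi-byte characters, count the runes yourself instead of relying on i.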
You can use the utf8string package:
package main

import "golang.org/x/exp/utf8string"

func main() {
	s := utf8string.NewString("ÄÅàâäåçèéêëìîïü")
	// example 1
	r := s.At(1)
	println(r == 'Å')
	// example 2
	t := s.Slice(1, 3)
	println(t == "Åà")
}
https://pkg.go.dev/golang.org/x/exp/utf8string