
Golang: How to correctly parse UTF-8 string from C

I'm new to the Go world, so maybe this is obvious.

I have a Go function that I'm exposing to C by building with go build -buildmode=c-shared and adding the corresponding //export funcName comment. (You can see it here: https://github.com/udl/bmatch/blob/master/ext/levenshtein.go#L42 )

My conversion currently works like this:

func distance(s1in, s2in *C.char) int {
    s1 := C.GoString(s1in)
    s2 := C.GoString(s2in)

How would I handle UTF-8 input here? I've seen there is a UTF-8 package but I don't quite get how it works. https://golang.org/pkg/unicode/utf8/

Thank you!

You don't need to do anything special. C.GoString copies the bytes of the C string unchanged, and UTF-8 is Go's "native" character encoding, so you can use the functions from the utf8 package you mentioned, e.g. utf8.RuneCountInString, to get the number of Unicode code points (runes) in a string. Keep in mind that len(s) still returns the number of bytes in the string, not the number of runes.
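To make the byte/rune difference concrete, here is a minimal sketch in pure Go. The string literal stands in for whatever C.GoString would return, and runeLen is a hypothetical helper name, not something from the linked repository. It also shows converting to []rune, which is probably what an edit-distance loop wants so that it compares characters rather than bytes.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeLen is a hypothetical helper: it returns the number of Unicode
// code points in s, not the number of bytes.
func runeLen(s string) int {
	return utf8.RuneCountInString(s)
}

func main() {
	// Stand-in for the result of C.GoString(s1in); \u00e9 is "é",
	// which takes two bytes in UTF-8.
	s := "h\u00e9llo"

	fmt.Println(len(s))              // 6 (bytes)
	fmt.Println(runeLen(s))          // 5 (code points)
	fmt.Println(utf8.ValidString(s)) // true (well-formed UTF-8)

	// Converting to []rune allows indexing by code point instead of by byte.
	r := []rune(s)
	fmt.Println(len(r), string(r[1])) // 5 é

	// Ranging over a string decodes one rune at a time;
	// i is the byte offset at which each rune starts.
	for i, c := range s {
		fmt.Printf("%d:%c ", i, c)
	}
	fmt.Println() // 0:h 1:é 3:l 4:l 5:o
}
```

If the C caller might pass bytes that are not valid UTF-8, utf8.ValidString is one way to check before doing any rune-based processing.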

See this post in the official blog or this article for some details.
