How can I convert a null-terminated string in a byte buffer to a string in Go?

This:

label := string([]byte{97, 98, 99, 0, 0, 0, 0})
fmt.Printf("%s\n", label)

prints this (^@ is the null byte):

go run test.go 
abc^@^@^@^@

Note that the first answer only works when the null terminator is followed by nothing but a run of zeroes; a proper C-style null-terminated string, however, ends at the first \0 even if it is followed by garbage. For example, []byte{97,98,99,0,99,99,0} should be parsed as abc, not abc^@cc.

To parse this properly, use strings.Index, as follows, to find the first \0 and use that index to slice the original byte slice:

package main

import (
    "fmt"
    "os"
    "strings"
)

func main() {
    label := []byte{97, 98, 99, 0, 99, 99, 0}
    // Find the first null byte; -1 means the buffer has no terminator.
    nullIndex := strings.Index(string(label), "\x00")
    if nullIndex < 0 {
        fmt.Println("Buffer did not hold a null-terminated string")
        os.Exit(1)
    }
    fmt.Println(string(label[:nullIndex]))
}

EDIT: Was printing the shortened version as a []byte instead of as a string. Thanks to @serbaut for the catch.

EDIT 2: Was not handling the error case of a buffer without a null terminator. Thanks to @snap for the catch.
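If exiting the program on a missing terminator is too drastic, the same idea can be wrapped in a function that reports the failure to the caller. The following is only a sketch: the name parseCString and the (string, bool) return shape are assumptions for illustration, and it swaps strings.Index for bytes.IndexByte so the buffer is not copied into a string before searching.

```go
package main

import (
	"bytes"
	"fmt"
)

// parseCString is a hypothetical helper, not part of any library.
// It returns the bytes before the first 0x00 as a string, and false
// if the buffer contains no null terminator at all.
func parseCString(buf []byte) (string, bool) {
	i := bytes.IndexByte(buf, 0)
	if i < 0 {
		return "", false
	}
	return string(buf[:i]), true
}

func main() {
	s, ok := parseCString([]byte{97, 98, 99, 0, 99, 99, 0})
	fmt.Printf("%q %v\n", s, ok) // "abc" true

	s, ok = parseCString([]byte{97, 98, 99})
	fmt.Printf("%q %v\n", s, ok) // "" false
}
```

Whether a missing terminator should be an error or should mean "use the whole buffer" depends on where the bytes come from, so the caller decides here.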

There's a function hidden inside Go's syscall package that finds the first null byte ([]byte{0}) and returns the length. I'm assuming it's called clen for C-Length.

Sorry I'm a year late with this answer, but I think it's a lot simpler than the other two (no unnecessary imports, etc.).

func clen(n []byte) int {
    for i := 0; i < len(n); i++ {
        if n[i] == 0 {
            return i
        }
    }
    return len(n)
}

So,

label := []byte{97, 98, 99, 0, 0, 0, 0}
s := label[:clen(label)]
fmt.Println(string(s))

What that says is: set s to the slice of bytes in label from the beginning up to (but not including) index clen(label).

The result would be abc, with a length of 3.
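To see how clen behaves at the edges, here is a small sketch that runs it on a buffer with garbage after the terminator, a buffer with no terminator, a nil slice, and a fixed-size array. The fixed-size array case is an assumed use (for example a name field in a C struct), not something from the answer.

```go
package main

import "fmt"

// clen is copied from the answer above: it returns the index of the
// first zero byte, or len(n) if there is none.
func clen(n []byte) int {
	for i := 0; i < len(n); i++ {
		if n[i] == 0 {
			return i
		}
	}
	return len(n)
}

func main() {
	fmt.Println(clen([]byte{97, 98, 99, 0, 99, 99, 0})) // 3
	fmt.Println(clen([]byte{97, 98, 99}))               // 3 (no terminator: whole buffer)
	fmt.Println(clen(nil))                              // 0

	// A fixed-size array must be sliced before it can be passed in.
	fixed := [8]byte{'g', 'o'}
	fmt.Println(string(fixed[:clen(fixed[:])])) // go
}
```

Because clen falls back to len(n), slicing with it never panics, but it also cannot tell a missing terminator apart from a terminator sitting exactly at the end.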

Use the strings package.

package main

import (
    "fmt"
    "strings"
)

func main() {
    label := string([]byte{97, 98, 99, 0, 0, 0, 0})
    fmt.Println(strings.TrimSpace(label))
}

You can use the sys package:

package main
import "golang.org/x/sys/windows"

func main() {
   b := []byte{97, 98, 99, 0, 0, 0, 0}
   s := windows.ByteSliceToString(b)
   println(s == "abc")
}

Or you can just implement it yourself:

package main
import "bytes"

func byteSliceToString(s []byte) string {
   n := bytes.IndexByte(s, 0)
   if n >= 0 {
      s = s[:n]
   }
   return string(s)
}

func main() {
   b := []byte{97, 98, 99, 0, 0, 0, 0}
   s := byteSliceToString(b)
   println(s == "abc")
}

In Go 1.18+, you can use bytes.Cut:

import (
    "bytes"
)

func bytesToStr(in []byte) string {
    str, _, _ := bytes.Cut(in, []byte{0})
    return string(str)
}
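As a rough illustration of what bytes.Cut does with the edge cases, the sketch below wraps the function above in a complete program. bytes.Cut returns the input unchanged as its first result when the separator is not found, so a buffer without a null terminator is converted whole.

```go
package main

import (
	"bytes"
	"fmt"
)

// bytesToStr is the function from the answer above.
func bytesToStr(in []byte) string {
	before, _, _ := bytes.Cut(in, []byte{0})
	return string(before)
}

func main() {
	// Garbage after the first null byte is ignored.
	fmt.Printf("%q\n", bytesToStr([]byte{97, 98, 99, 0, 99, 99, 0})) // "abc"

	// No null byte: the whole buffer is returned.
	fmt.Printf("%q\n", bytesToStr([]byte("abc"))) // "abc"

	// The other two results give the remainder and whether a null was found.
	_, rest, found := bytes.Cut([]byte("abc\x00def\x00"), []byte{0})
	fmt.Printf("%q %v\n", rest, found) // "def\x00" true
}
```

The found result is handy when a missing terminator should be treated as an error rather than silently accepted.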

1. strings.TrimSpace and strings.TrimRight

// Trimming removes only trailing '\0' bytes, so it can't handle bytes like "abc\x00def\x00".

I can't edit @orelli's answer, so I'm writing this here:

package main

import (
    "fmt"
    "strings"
)

func main() {
    label := string([]byte{97, 98, 99, 0, 0, 0, 0})

    s1 := strings.TrimSpace(label)
    fmt.Println(len(s1), s1)

    s2 := strings.TrimRight(label, "\x00")
    fmt.Println(len(s2), s2)
}

Output:

7 abc????
3 abc

// ? is '\0', which can't be displayed here.

So .TrimSpace can't trim '\0', but .TrimRight with "\x00" can.



2. bytes.IndexByte

Search for the first '\0'. UTF-8 input should be fine, since a zero byte never appears inside a multi-byte UTF-8 sequence.

package main

import (
    "bytes"
    "fmt"
    "strings"
)

func main() {
    b_arr := []byte{97, 98, 99, 0, 100, 0, 0}
    label := string(b_arr)

    s1 := strings.TrimSpace(label)
    fmt.Println(len(s1), s1)   //7 abc?d??

    s2 := strings.TrimRight(label, "\x00")
    fmt.Println(len(s2), s2)   //5 abc?d

    n := bytes.IndexByte([]byte(label), 0)
    fmt.Println(n, label[:n])  //3 abc

    s_arr := b_arr[:bytes.IndexByte(b_arr, 0)]
    fmt.Println(len(s_arr), string(s_arr)) //3 abc
}

Equivalent ways to find the index (when the buffer contains a '\0'):

n1 := bytes.IndexByte(b_arr, 0)
n2 := bytes.Index(b_arr, []byte{0})

n3, c := 0, byte(0)
for n3, c = range b_arr {
    if c == 0 {
        break
    }
}
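One caveat worth checking: the three forms agree only when a zero byte is actually present. The sketch below compares them; loopIndex is a hypothetical wrapper around the range loop from the snippet, added only so it can be called twice. Without a zero byte, bytes.IndexByte and bytes.Index return -1, while the loop leaves the index at the last element.

```go
package main

import (
	"bytes"
	"fmt"
)

// loopIndex wraps the range loop from the snippet above
// (the function name is an assumption for this example).
func loopIndex(b []byte) int {
	n, c := 0, byte(0)
	for n, c = range b {
		if c == 0 {
			break
		}
	}
	return n
}

func main() {
	with := []byte{97, 98, 99, 0, 100, 0, 0}
	fmt.Println(bytes.IndexByte(with, 0), bytes.Index(with, []byte{0}), loopIndex(with))
	// 3 3 3

	without := []byte{97, 98, 99}
	fmt.Println(bytes.IndexByte(without, 0), bytes.Index(without, []byte{0}), loopIndex(without))
	// -1 -1 2
}
```

So code that slices with the result needs a check for -1 in the first two forms, and the loop form would silently drop the last byte of an unterminated buffer.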

The first answer will not work!!

func TrimSpace(s []byte) []byte {
    return TrimFunc(s, unicode.IsSpace)
}

func IsSpace(r rune) bool {
    // This property isn't the same as Z; special-case it.
    if uint32(r) <= MaxLatin1 {
        switch r {
        case '\t', '\n', '\v', '\f', '\r', ' ', 0x85, 0xA0:
            return true
        }
        return false
    }
    return isExcludingLatin(White_Space, r)
}

There is no "\x00" in func IsSpace at all.
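This is easy to confirm with a few lines. The sketch below is only a quick check; trimLens is a hypothetical helper that returns the length of a string after strings.TrimSpace and after strings.TrimRight with "\x00".

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// trimLens is a hypothetical helper for this check: it returns the
// length left after TrimSpace and after TrimRight(s, "\x00").
func trimLens(s string) (afterSpace, afterNul int) {
	return len(strings.TrimSpace(s)), len(strings.TrimRight(s, "\x00"))
}

func main() {
	fmt.Println(unicode.IsSpace(0))   // false
	fmt.Println(unicode.IsSpace(' ')) // true

	a, b := trimLens("abc\x00\x00")
	fmt.Println(a, b) // 5 3
}
```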

