Substrings and the Go garbage collector

When taking a substring of a string in Go, no new memory is allocated. Instead, the underlying representation of the substring contains a Data pointer that points into the original string's data at an offset.

This means that if I have a large string and wish to keep track of a small substring, the garbage collector will be unable to free any of the large string until I release all references to the shorter substring.

Slices have a similar problem, but you can get around it by making a copy of the subslice using copy(). I am unaware of any similar copy operation for strings. What is the idiomatic and fastest way to make a "copy" of a substring?

For example,

package main

import (
    "fmt"
    "unsafe"
)

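// String mirrors the layout of a string header (data pointer and length).
// It is used here only to print those two fields.
type String struct {
    str *byte
    len int
}

func main() {
    str := "abc"
    // Converting to []byte and back to string copies the bytes, so substr
    // gets its own backing memory.
    substr := string([]byte(str[1:]))
    fmt.Println(str, substr)
    // Print both headers to show that the data pointers are unrelated.
    fmt.Println(*(*String)(unsafe.Pointer(&str)), *(*String)(unsafe.Pointer(&substr)))
}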

Output:

abc bc
{0x4c0640 3} {0xc21000c940 2}

I know this is an old question, but there are a couple of ways you can do this without creating two copies of the data you want.

The first is to create the []byte of the substring, then simply coerce it to a string using unsafe.Pointer. This works because the header for a []byte is the same as that for a string, except that the []byte has an extra Cap field at the end, so it just gets truncated.

package main

import (
    "fmt"
    "unsafe"
)

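func main() {
    str := "foobar"
    byt := []byte(str[3:]) // the only copy of the substring's bytes
    // Reinterpret the slice header as a string header; the trailing Cap
    // field is ignored. byt must not be modified afterwards, because sub
    // shares its memory.
    sub := *(*string)(unsafe.Pointer(&byt))
    fmt.Println(str, sub)
}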

The second way is to use reflect.StringHeader and reflect.SliceHeader to do a more explicit header transfer.

package main

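import (
    "fmt"
    "reflect"
    "unsafe"
)

func main() {
    str := "foobar"
    byt := []byte(str[3:]) // the only copy of the substring's bytes

    // Write byt's Data and Len into the header of a real string variable.
    // A standalone reflect.StringHeader value would hold the address only
    // as a uintptr, which the garbage collector does not track.
    var sub string
    bytHdr := (*reflect.SliceHeader)(unsafe.Pointer(&byt))
    strHdr := (*reflect.StringHeader)(unsafe.Pointer(&sub))
    strHdr.Data = bytHdr.Data
    strHdr.Len = bytHdr.Len
    fmt.Println(str, sub)
}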
