I'm trying to implement a proxy pattern to chain transformations on an io.Reader, in order to handle chunks of bytes efficiently.
We cannot use pointers on receivers, so my solution does not seem very efficient.
The code below fails with "process took too long".
Complete example at: https://play.golang.org/p/KhM0VXLq4CO
b := bytes.NewBufferString(text)
t := transformReaderHandler(*b)
readByChunk(t)

type transformReaderHandler bytes.Buffer

func (t transformReaderHandler) Read(p []byte) (n int, err error) {
	n, err = (*bytes.Buffer)(&t).Read(p)
	//if n > 0 {
	// Do Something on the chunk
	//}
	return
}
Do you have a more efficient (memory-efficient, computationally efficient) solution?
Why is this code not working?
EDIT: An implementation of @svsd's solution: https://play.golang.org/p/VUpJcyKLB6D
package main

import (
	"bytes"
	"fmt"
	"io"
)

const text = "Reaaaaally long and complex text to read in chunk"

func main() {
	b := bytes.NewBufferString(text)
	t := (*transformReaderHandler)(b)
	readByChunk(t)
}

type transformReaderHandler bytes.Buffer

func (t *transformReaderHandler) Read(p []byte) (n int, err error) {
	n, err = (*bytes.Buffer)(t).Read(p)
	if n > 0 {
		p[0] = 'X'
	}
	return
}

func readByChunk(r io.Reader) {
	var p = make([]byte, 4)
	for {
		n, err := r.Read(p)
		if err == io.EOF {
			break
		}
		fmt.Println(string(p[:n]))
	}
}
You're copying the bytes.Buffer value each time Read is called on the transformReaderHandler, so you can never progress through the buffer. You must use a *bytes.Buffer pointer to avoid this copy.
Embed the buffer (or alternatively add it as a named field) in your transformReaderHandler, so you can delegate to its Read method as needed.
type transformReaderHandler struct {
	*bytes.Buffer
}

func (t *transformReaderHandler) Read(p []byte) (n int, err error) {
	n, err = t.Buffer.Read(p)
	//if n > 0 {
	// Do Something
	//}
	return
}
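For completeness, here is a minimal sketch of how the embedded version might be constructed and driven. The ASCII upper-casing stands in for "Do Something", and the `run` helper, the chunk size of 4, and the error-returning variant of `readByChunk` are assumptions made for illustration, not part of the original code.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

type transformReaderHandler struct {
	*bytes.Buffer
}

// Read delegates to the embedded buffer, then upper-cases ASCII letters
// in place. The upper-casing is only a placeholder transformation.
func (t *transformReaderHandler) Read(p []byte) (n int, err error) {
	n, err = t.Buffer.Read(p)
	for i := 0; i < n; i++ {
		if p[i] >= 'a' && p[i] <= 'z' {
			p[i] -= 'a' - 'A'
		}
	}
	return
}

// readByChunk reads r in chunks of at most size bytes and collects them.
// It handles the bytes read before looking at err, and stops on any error.
func readByChunk(r io.Reader, size int) ([]string, error) {
	var chunks []string
	p := make([]byte, size)
	for {
		n, err := r.Read(p)
		if n > 0 {
			chunks = append(chunks, string(p[:n]))
		}
		if err == io.EOF {
			return chunks, nil
		}
		if err != nil {
			return chunks, err
		}
	}
}

// run is a hypothetical helper that wires the pieces together.
func run(text string) []string {
	t := &transformReaderHandler{Buffer: bytes.NewBufferString(text)}
	chunks, _ := readByChunk(t, 4)
	return chunks
}

func main() {
	for _, c := range run("long text") {
		fmt.Printf("%q\n", c)
	}
}
```

Because the struct holds a *bytes.Buffer, the buffer's read offset lives behind a pointer and is shared however the handler itself is passed around.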
The code below fails with "process took too long".
Why is this code not working?
In the transformReaderHandler.Read() method, you have a value receiver. That means each time Read() is called, it gets a copy of the instance on which it was called. When you then call (*bytes.Buffer)(&t).Read(p), it modifies the internal state of that copy, so that a later read on the copy would continue from where the previous one stopped.
Now, because the instance is a copy, it is discarded after the method exits and the original instance remains unchanged. Hence, each time you call Read(), bytes.Buffer.Read() reads only the first few bytes. To prove this, add a statement fmt.Println("n=", n, "err=", err) inside readByChunk() after calling Read().
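The difference can be seen in isolation with a small experiment. This is only a sketch: `valueReader`, `pointerReader`, `readTwice` and the two `demo` helpers are hypothetical names introduced for the comparison.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

type valueReader bytes.Buffer

// Read has a value receiver: v is a fresh copy of the buffer on every call,
// so the offset advanced by the read is lost when the method returns.
func (v valueReader) Read(p []byte) (int, error) {
	return (*bytes.Buffer)(&v).Read(p)
}

type pointerReader bytes.Buffer

// Read has a pointer receiver: the offset is advanced on the original buffer.
func (r *pointerReader) Read(p []byte) (int, error) {
	return (*bytes.Buffer)(r).Read(p)
}

// readTwice performs two 4-byte reads and returns both chunks.
func readTwice(r io.Reader) (first, second string) {
	p := make([]byte, 4)
	n, _ := r.Read(p)
	first = string(p[:n])
	n, _ = r.Read(p)
	second = string(p[:n])
	return first, second
}

func demoValue() (string, string) {
	v := valueReader(*bytes.NewBufferString("abcdefgh"))
	return readTwice(v)
}

func demoPointer() (string, string) {
	r := (*pointerReader)(bytes.NewBufferString("abcdefgh"))
	return readTwice(r)
}

func main() {
	a, b := demoValue()
	fmt.Printf("value receiver:   %q %q\n", a, b) // "abcd" "abcd"
	a, b = demoPointer()
	fmt.Printf("pointer receiver: %q %q\n", a, b) // "abcd" "efgh"
}
```

With the value receiver the same four bytes come back on every call and io.EOF is never reached, which is why the loop in the question never terminates.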
To quickly check that this is indeed due to the value receiver, you can define transformReaderHandler.Read() with a pointer receiver and store t as t = (*transformReaderHandler)(b). I'll let you examine what it does. (Edit: the correct solution involving embedding is in the comments.)
Do you have a more efficient (memory-efficient, computationally efficient) solution?
If you're only looking for buffered I/O for more efficient reads, look at bufio.NewReader(). If that's not sufficient, you can take inspiration from it and wrap an io.Reader interface instead of wrapping a bytes.Buffer instance.
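As a rough sketch of that last suggestion, a wrapper around the io.Reader interface could look like the following. The `transformReader` type, the `upper` and `dashSpaces` transformations, and the `chain` helper are all hypothetical, and the sketch assumes each transformation works in place without changing the chunk length.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// transformReader applies fn to every chunk read from src.
// It allocates no buffer of its own; it rewrites the caller's slice in place.
type transformReader struct {
	src io.Reader
	fn  func(chunk []byte)
}

func (t *transformReader) Read(p []byte) (int, error) {
	n, err := t.src.Read(p)
	if n > 0 {
		t.fn(p[:n])
	}
	return n, err
}

// upper upper-cases ASCII letters in place.
func upper(chunk []byte) {
	for i, c := range chunk {
		if c >= 'a' && c <= 'z' {
			chunk[i] = c - ('a' - 'A')
		}
	}
}

// dashSpaces replaces spaces with dashes in place.
func dashSpaces(chunk []byte) {
	for i, c := range chunk {
		if c == ' ' {
			chunk[i] = '-'
		}
	}
}

// chain stacks two transformations on top of any io.Reader.
func chain(text string) (string, error) {
	var r io.Reader = strings.NewReader(text)
	r = &transformReader{src: r, fn: upper}
	r = &transformReader{src: r, fn: dashSpaces}
	out, err := io.ReadAll(r) // io.ReadAll requires Go 1.16 or later
	return string(out), err
}

func main() {
	s, err := chain("long text")
	fmt.Println(s, err)
}
```

Since the wrapper depends only on io.Reader, the same type can sit on top of a file, a network connection, a bufio.Reader, or another transformReader, which is what makes chaining possible.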