
Go Nonblocking multiple receive on channel

Everything I read seems to say that reading from a channel should always be a blocking operation. The attitude seems to be that this is the Go way. That makes some sense, but I'm trying to figure out how I would aggregate values from channels.

For example, sending HTTP requests. Say I have a pipeline set up that generates streams of data, so I have a channel that produces a queue/stream of points. I could then have a goroutine listen to this channel and send an HTTP request to store each point in a service. This works, but I'm creating an HTTP request for every point.

The endpoint I'm sending to allows me to send multiple data points in a batch. What I would like to do is:

  1. Read as many values until I would block on channel.
  2. Combine them/send single http request.
  3. Then block on channel until I can read one again.

This is how I would've done things in C, with thread-safe queues and select statements - basically flushing the entire queue/buffer when possible. Is this a valid technique in Go?

It seems Go's select statement gives me something similar to C's select, but I'm still not sure whether there is a 'nonblocking read' on channels.

EDIT: I'm also willing to accept that what I'm intending may not be the Go way, but constantly firing off non-stop HTTP requests also seems wrong to me, especially if they can be aggregated. If someone has an alternative architecture that would be cool, but I want to avoid things like magically buffering N items, or waiting X seconds before sending.

Here's how to batch until the channel is empty. The variable batch is a slice of your data point type. The variable ch is a channel of your data point type.

var batch []someType
for {
    select {
    case v := <-ch:
        batch = append(batch, v)
    default:
        if len(batch) > 0 {
            sendBatch(batch)
            batch = batch[:0]
        }
        batch = append(batch, <-ch) // receiving a value here prevents busy waiting
    }
}

You should prevent the batch from growing without limit. Here's a simple way to do it:

var batch []someType
for {
    select {
    case v := <-ch:
        batch = append(batch, v)
        if len(batch) >= batchLimit {
            sendBatch(batch)
            batch = batch[:0]
        }
    default:
        if len(batch) > 0 {
            sendBatch(batch)
            batch = batch[:0]
        }
        batch = append(batch, <-ch)
    }
}

Dewy Broto has given a good, direct solution to your problem, but I wanted to comment more broadly on how you might go about finding solutions to different problems.

Go uses Communicating Sequential Processes (CSP) as the basis for its channels, selection and lightweight processes ('goroutines'). CSP guarantees the order of events; it only introduces non-determinism when you make it do so by making a choice (i.e. select). The guaranteed ordering is sometimes called "happens-before" - it makes coding so much simpler than the alternative (widely popular) non-blocking style. It also gives more scope for creating components: units of long-lived functionality that interact with the outside world through channels in a predictable way.

Perhaps talk of blocking on channels puts a mental hurdle in the way of people learning Go. We block on I/O, but we wait on channels. Waiting on channels is not to be frowned on, provided that the system as a whole has enough parallel slackness (i.e. other active goroutines) to keep the CPU busy.

Visualising Components

So, back to your problem. Let's think about it in terms of components: you have many sources of points that need handling. Suppose each source is a goroutine; it then forms a component in your design with an output channel. Go lets channel ends be shared, therefore many sources can safely interleave their points, in order, onto a single channel. You don't have to do anything special - it's just how channels work.

The batching function described by Dewy Broto is, in essence, another component. As a learning exercise, it's a good thing to express it this way. The batching component has one input channel of points and one output channel of batches.

Finally, the HTTP I/O behaviour could also be a component with one input channel and no output channels, serving merely to receive whole batches of points and send them via HTTP.

Taking the simple case of only one source, this might be depicted like this:

+--------+     point     +---------+     batch     +-------------+
| source +------->-------+ batcher +------->-------+ http output |
+--------+               +---------+               +-------------+

The intention here is to depict the different activities at their fundamental level. It's a bit like a digital circuit diagram, and that's not a coincidence.

You could indeed implement this in Go and it would work. It might even work well enough, but in practice you may prefer to optimise it by combining pairs of components, repeatedly as necessary. In this case, it's easy to combine the batcher and the http output, and doing so ends up with Dewy Broto's solution.

The important point is that Go concurrency comes easiest when you:

  • (a) don't worry up front about blocking;
  • (b) depict the activities that need to happen at a fairly fine-grained level (you can do this in your head in simple cases);
  • (c) if necessary, optimise by combining functions together.

I'll leave as a challenge the more advanced topic of visualising mobile channel ends (Pi-Calculus) where channels are used to send channel-ends to other goroutines.
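As a small taste of that idea: Go channels are first-class values, so a channel end can itself be sent over a channel. A common request/response sketch (all names hypothetical):

```go
package main

import "fmt"

// request carries its own reply channel - a "mobile channel end".
type request struct {
	x     int
	reply chan int
}

// server answers each request on the channel that the request carried,
// so it never needs to know who is asking.
func server(reqs <-chan request) {
	for r := range reqs {
		r.reply <- r.x * 2
	}
}

func main() {
	reqs := make(chan request)
	go server(reqs)

	reply := make(chan int)
	reqs <- request{x: 21, reply: reply}
	fmt.Println(<-reply) // prints 42
}
```

Because each caller supplies a fresh reply channel, many clients can share one request channel and still each receive their own answer.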
