How to use a goroutine pool

I want to use Go to download stock price spreadsheets from Yahoo Finance. I'll make an HTTP request for every stock in its own goroutine. I have a list of around 2500 symbols, but instead of making 2500 requests in parallel, I'd prefer to make 250 at a time. In Java I'd create a thread pool and reuse threads as they become free. I was trying to find something similar, a goroutine pool if you will, but couldn't find any resources. I'd appreciate it if someone could tell me how to accomplish this or point me to relevant resources. Thanks!

The simplest way, I suppose, is to create 250 goroutines and pass them a channel, which the main goroutine uses to send links to the child goroutines listening on it.

Once all links have been sent, you close the channel; each goroutine finishes the job it is working on and then exits.

To keep the main goroutine from finishing before the children have processed the data, you can use sync.WaitGroup.

Here is some code to illustrate the above (a sketch that shows the point, not a finished downloader):

package main

import (
    "fmt"
    "sync"
)

func worker(linkChan <-chan string, wg *sync.WaitGroup) {
    // Decrement the wait-group counter when this goroutine returns
    defer wg.Done()

    for url := range linkChan {
        // Analyze the value and do the job here (placeholder: just print it)
        fmt.Println("processing", url)
    }
}

func main() {
    // Placeholder input; replace with your real list of links
    yourLinksSlice := []string{"link-1", "link-2", "link-3"}

    lCh := make(chan string)
    wg := new(sync.WaitGroup)

    // Add the goroutines to the wait group and start them
    for i := 0; i < 250; i++ {
        wg.Add(1)
        go worker(lCh, wg)
    }

    // Hand each link to whichever goroutine is free
    for _, link := range yourLinksSlice {
        lCh <- link
    }

    // Close the channel so the range loops in the workers terminate
    close(lCh)

    // Wait for all goroutines to finish (otherwise they die when main returns)
    wg.Wait()
}
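For the download use case in the question, the "do the job here" placeholder could look roughly like the sketch below. The names `fetch` and `downloadAll`, the 10-second timeout, and the local `httptest` server (used only so the example runs without network access) are assumptions for illustration, not part of the answer above.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// fetch downloads one URL and returns the size of the response body.
// Hypothetical helper: a real downloader would write the body to a file.
func fetch(client *http.Client, url string) (int64, error) {
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("unexpected status %s", resp.Status)
	}
	return io.Copy(io.Discard, resp.Body)
}

// downloadAll fetches every URL using at most `workers` goroutines and
// returns how many downloads succeeded.
func downloadAll(urls []string, workers int) int {
	client := &http.Client{Timeout: 10 * time.Second} // assumed timeout
	linkChan := make(chan string)
	var wg sync.WaitGroup
	var mu sync.Mutex
	succeeded := 0

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for url := range linkChan {
				if _, err := fetch(client, url); err != nil {
					fmt.Println("download failed:", url, err)
					continue
				}
				mu.Lock()
				succeeded++
				mu.Unlock()
			}
		}()
	}

	for _, u := range urls {
		linkChan <- u
	}
	close(linkChan)
	wg.Wait()
	return succeeded
}

func main() {
	// Local stand-in server so the sketch runs offline.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "date,close\n2024-01-02,100\n")
	}))
	defer srv.Close()

	urls := []string{srv.URL + "/AAPL.csv", srv.URL + "/GOOG.csv", srv.URL + "/MSFT.csv"}
	fmt.Println("downloaded:", downloadAll(urls, 2))
}
```

Sharing one `http.Client` across workers lets connections be reused, and a failed download is logged and skipped rather than stopping the pool.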

You can use the thread pool implementation library for Go from this git repo.

Here is a nice blog post about how to use channels as a thread pool.

Snippet from the blog:

var (
    MaxWorker = os.Getenv("MAX_WORKERS")
    MaxQueue  = os.Getenv("MAX_QUEUE")
)

//Job represents the job to be run
type Job struct {
    Payload Payload
}

// A buffered channel that we can send work requests on.
var JobQueue chan Job

// Worker represents the worker that executes the job
type Worker struct {
    WorkerPool  chan chan Job
    JobChannel  chan Job
    quit        chan bool
}

func NewWorker(workerPool chan chan Job) Worker {
    return Worker{
        WorkerPool: workerPool,
        JobChannel: make(chan Job),
        quit:       make(chan bool)}
}

// Start method starts the run loop for the worker, listening for a quit channel in
// case we need to stop it
func (w Worker) Start() {
    go func() {
        for {
            // register the current worker into the worker queue.
            w.WorkerPool <- w.JobChannel

            select {
            case job := <-w.JobChannel:
                // we have received a work request.
                if err := job.Payload.UploadToS3(); err != nil {
                    log.Errorf("Error uploading to S3: %s", err.Error())
                }

            case <-w.quit:
                // we have received a signal to stop
                return
            }
        }
    }()
}

// Stop signals the worker to stop listening for work requests.
func (w Worker) Stop() {
    go func() {
        w.quit <- true
    }()
} 
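The snippet does not show how jobs get from the queue to the workers. One possible way to wire it up is sketched below. The `Job.Run` field (standing in for `Payload.UploadToS3`), the `dispatch` function, and the `runJobs` helper are all assumptions made so that the example is self-contained; they are not taken from the blog.

```go
package main

import (
	"fmt"
	"sync"
)

// Job is a simplified stand-in: it carries a function instead of a Payload.
type Job struct{ Run func() }

// Worker mirrors the Worker type from the snippet.
type Worker struct {
	WorkerPool chan chan Job
	JobChannel chan Job
	quit       chan bool
}

func NewWorker(workerPool chan chan Job) Worker {
	return Worker{WorkerPool: workerPool, JobChannel: make(chan Job), quit: make(chan bool)}
}

func (w Worker) Start() {
	go func() {
		for {
			// Register this worker's private channel as available.
			w.WorkerPool <- w.JobChannel
			select {
			case job := <-w.JobChannel:
				job.Run()
			case <-w.quit:
				return
			}
		}
	}()
}

func (w Worker) Stop() {
	go func() { w.quit <- true }()
}

// dispatch hands each queued job to the next idle worker (assumed helper).
func dispatch(jobQueue <-chan Job, workerPool chan chan Job) {
	for job := range jobQueue {
		jobChannel := <-workerPool // wait for an idle worker
		jobChannel <- job          // hand the job to that worker
	}
}

// runJobs runs numJobs trivial jobs on maxWorkers workers and returns how
// many jobs were executed.
func runJobs(numJobs, maxWorkers int) int {
	// Buffered so that every worker can register without blocking.
	workerPool := make(chan chan Job, maxWorkers)
	jobQueue := make(chan Job, numJobs)

	workers := make([]Worker, 0, maxWorkers)
	for i := 0; i < maxWorkers; i++ {
		w := NewWorker(workerPool)
		w.Start()
		workers = append(workers, w)
	}

	var wg sync.WaitGroup
	var mu sync.Mutex
	done := 0
	for i := 0; i < numJobs; i++ {
		wg.Add(1)
		jobQueue <- Job{Run: func() {
			defer wg.Done()
			mu.Lock()
			done++
			mu.Unlock()
		}}
	}
	close(jobQueue)

	go dispatch(jobQueue, workerPool)
	wg.Wait()

	for _, w := range workers {
		w.Stop()
	}
	return done
}

func main() {
	fmt.Println("jobs processed:", runJobs(10, 3))
}
```

Each worker owns a private job channel and advertises it on the shared pool channel when idle, so the dispatcher only ever sends a job to a worker that is ready to receive it.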

This example uses two channels, one for input and another for output. The number of workers can be scaled to any size; each goroutine reads from the input queue and sends its results to the output channel. Feedback on easier methods is very welcome.

package main

import (
    "fmt"
    "sync"
)

var wg sync.WaitGroup

func worker(input chan string, output chan string) {
    defer wg.Done()
    // Consumer: Process items from the input channel and send results to output channel
    for value := range input {
        output <- value + " processed"
    }
}

func main() {
    var jobs = []string{"one", "two", "three", "four", "two", "three", "four", "two", "three", "four", "two", "three", "four", "two", "three", "four", "two"}
    input := make(chan string, len(jobs))
    output := make(chan string, len(jobs))
    workers := 250

    // Increment waitgroup counter and create go routines
    for i := 0; i < workers; i++ {
        wg.Add(1)
        go worker(input, output)
    }

    // Producer: load up input channel with jobs
    for _, job := range jobs {
        input <- job
    }

    // Close input channel since no more jobs are being sent to input channel
    close(input)
    // Wait for all goroutines to finish processing
    wg.Wait()
    // Close output channel since all workers have finished processing
    close(output)

    // Read from output channel
    for result := range output {
        fmt.Println(result)
    }

}
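One possible variation on the example above, offered as a sketch rather than a correction: close the output channel from a separate goroutine once all workers are done, and read results while the workers are still running. That way neither channel has to be buffered to `len(jobs)`. The function name `process` and the sample inputs are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// process runs every job through a pool of `workers` goroutines and collects
// the results. Results arrive in no particular order.
func process(jobs []string, workers int) []string {
	input := make(chan string)
	output := make(chan string)
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for value := range input {
				output <- value + " processed"
			}
		}()
	}

	// Producer: feed the jobs, then signal that no more are coming.
	go func() {
		for _, job := range jobs {
			input <- job
		}
		close(input)
	}()

	// Closer: once every worker has returned, no more results can arrive.
	go func() {
		wg.Wait()
		close(output)
	}()

	// Consumer: runs concurrently with the workers, so output can be unbuffered.
	var results []string
	for result := range output {
		results = append(results, result)
	}
	return results
}

func main() {
	results := process([]string{"one", "two", "three", "four"}, 3)
	sort.Strings(results) // sort only to make the printed order stable
	fmt.Println(results)
}
```

The trade-off is two extra goroutines in exchange for constant memory use in the channels, which may matter more for 2500 downloads than for a handful of strings.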

You can take a look at this.

We have created a thread pool in Go and have been using it in our production systems.

I took reference from here.

It's pretty simple to use and also has a Prometheus client that tells you how many workers are in use.

To initialize, just create an instance of the dispatcher:

dispatcher = workerpool.NewDispatcher(
    "DispatcherName",
    workerpool.SetMaxWorkers(10),
)

Create an object (let's say job) that implements this interface, so it should implement the Process method:

// IJob : Interface for the Job to be processed
type IJob interface {
    Process() error
}
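A job type satisfying this interface might look like the following sketch. The `greetJob` type and its field are hypothetical and exist only for illustration; the interface is redeclared locally so the example compiles without the library.

```go
package main

import (
	"errors"
	"fmt"
)

// IJob is redeclared here so that the sketch is self-contained.
type IJob interface {
	Process() error
}

// greetJob is a hypothetical job type used only for this example.
type greetJob struct {
	name string
}

// Process does the job's work and reports failure through the error value.
func (j greetJob) Process() error {
	if j.name == "" {
		return errors.New("empty name")
	}
	fmt.Println("hello,", j.name)
	return nil
}

// Compile-time check that greetJob satisfies IJob.
var _ IJob = greetJob{}

func main() {
	jobs := []IJob{greetJob{name: "gopher"}, greetJob{}}
	for _, job := range jobs {
		if err := job.Process(); err != nil {
			fmt.Println("job failed:", err)
		}
	}
}
```

Any value with a `Process() error` method can be used this way, since Go interfaces are satisfied implicitly.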

Then just send the job to the dispatcher:

dispatcher.JobQueue <- job //object of job

That's it.
