Timeout for WaitGroup.Wait()

What is an idiomatic way to assign a timeout to WaitGroup.Wait()?

The reason I want to do this is to safeguard my 'scheduler' from potentially waiting forever for an errant 'worker'. This leads to some philosophical questions (i.e. how can the system reliably continue once it has errant workers?), but I think that's out of scope for this question.

I have an answer which I'll provide. Now that I've written it down, it doesn't seem so bad, but it still feels more convoluted than it ought to. I'd like to know if there's something available which is simpler, more idiomatic, or even an alternative approach which doesn't use WaitGroups.

Ta.

Mostly, the solution you posted below is as good as it gets. A couple of tips to improve it:

  • Alternatively, you may close the channel to signal completion instead of sending a value on it; a receive operation on a closed channel can always proceed immediately.
  • It's better to use a defer statement to signal completion; it is executed even if the function terminates abruptly.
  • Also, if there is only one "job" to wait for, you can omit the WaitGroup completely and just send a value or close the channel when the job is complete (the same channel you use in your select statement).
  • Specifying a 1-second duration is as simple as timeout := time.Second. Specifying 2 seconds, for example, is timeout := 2 * time.Second. You don't need the conversion: time.Second is already of type time.Duration, and multiplying it by an untyped constant like 2 also yields a value of type time.Duration.

I would also create a helper / utility function wrapping this functionality. Note that the WaitGroup must be passed as a pointer, otherwise the copy will not get "notified" of the WaitGroup.Done() calls. Something like:

// waitTimeout waits for the waitgroup for the specified max timeout.
// Returns true if waiting timed out.
func waitTimeout(wg *sync.WaitGroup, timeout time.Duration) bool {
    c := make(chan struct{})
    go func() {
        defer close(c)
        wg.Wait()
    }()
    select {
    case <-c:
        return false // completed normally
    case <-time.After(timeout):
        return true // timed out
    }
}

Using it:

if waitTimeout(&wg, time.Second) {
    fmt.Println("Timed out waiting for wait group")
} else {
    fmt.Println("Wait group finished")
}

Try it on the Go Playground.

I did it like this: http://play.golang.org/p/eWv0fRlLEC

c := make(chan struct{})
go func() {
    wg.Wait()
    c <- struct{}{}
}()
timeout := time.Duration(1) * time.Second
fmt.Printf("Wait for waitgroup (up to %s)\n", timeout)
select {
case <-c:
    fmt.Printf("Wait group finished\n")
case <-time.After(timeout):
    fmt.Printf("Timed out waiting for wait group\n")
}
fmt.Printf("Free at last\n")

It works fine, but is it the best way to do it?

Most existing answers suggest leaking goroutines. The idiomatic way to assign a timeout to WaitGroup.Wait is to use the underlying sync/atomic package primitives. I took the code from @icza's answer and rewrote it using the atomic package, and added context cancelation, as that's an idiomatic way to signal a timeout.

package main

import (
    "context"
    "fmt"
    "sync/atomic"
    "time"
)

func main() {
    var submitCount int32
    // run this instead of wg.Add(1)
    atomic.AddInt32(&submitCount, 1)

    // run this instead of wg.Done()
    // atomic.AddInt32(&submitCount, -1)

    timeout := time.Second
    ctx, cancel := context.WithTimeout(context.Background(), timeout)
    defer cancel()
    fmt.Printf("Wait for waitgroup (up to %s)\n", timeout)

    waitWithCtx(ctx, &submitCount)

    fmt.Println("Free at last")
}

// waitWithCtx returns when passed counter drops to zero
// or when context is cancelled
func waitWithCtx(ctx context.Context, counter *int32) {
    ticker := time.NewTicker(10 * time.Millisecond)
    defer ticker.Stop() // release the ticker's resources on return
    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            if atomic.LoadInt32(counter) == 0 {
                return
            }
        }
    }
}

Same code in the Go Playground.

This is a bad idea. Do not abandon goroutines; doing so may introduce races, resource leaks and unexpected conditions, ultimately impacting the stability of your application.

Instead, use timeouts consistently throughout your code to make sure no goroutine is blocked forever or takes too long to run.

The idiomatic way to achieve that is via context.WithTimeout():

ctx, cancel := context.WithTimeout(context.Background(), 5 * time.Second)
defer cancel()

// Now perform any I/O using the given ctx:
go func() {
  err := example.Connect(ctx)
  if err != nil { /* handle err and exit goroutine */ }
  // ...
}()

Now you can safely use WaitGroup.Wait(), knowing it will always finish in a timely manner.

This is not an actual answer to this question, but it was the (much simpler) solution to my little problem when I had this question.

My 'workers' were doing http.Get() requests, so I just set the timeout on the HTTP client.

urls := []string{"http://1.jpg", "http://2.jpg"}
wg := &sync.WaitGroup{}
for _, url := range urls {
    wg.Add(1)
    go func(url string) {
        defer wg.Done() // runs on every return path, including early error returns
        client := http.Client{
            Timeout: 3 * time.Second, // only want very fast responses
        }
        resp, err := client.Get(url)
        //... check for errors
        //... do something with the image when there are no errors
        //...
    }(url)

}
wg.Wait()

The following will not introduce any leaking goroutines:

func callingFunc() {
    ...
    wg := new(sync.WaitGroup)
    for _, msg := range msgs {
        wg.Add(1)
        go wrapperParallelCall(ctx, params, wg)
    }

    wg.Wait()
}

func wrapperParallelCall(ctx, params, wg) {
    ctx, cancel := context.WithTimeout(ctx, time.Second)
    defer wg.Done()
    defer cancel()

    originalSequenceCall(ctx, params)
}

func originalSequenceCall(ctx, params) {...}

I wrote a library that encapsulates the concurrency logic, https://github.com/shomali11/parallelizer, to which you can also pass a timeout.

Here is an example without a timeout:

func main() {
    group := parallelizer.DefaultGroup()

    group.Add(func() {
        for char := 'a'; char < 'a'+3; char++ {
            fmt.Printf("%c ", char)
        }
    })

    group.Add(func() {
        for number := 1; number < 4; number++ {
            fmt.Printf("%d ", number)
        }
    })

    err := group.Run()

    fmt.Println()
    fmt.Println("Done")
    fmt.Printf("Error: %v", err)
}

Output:

a 1 b 2 c 3 
Done
Error: <nil>

Here is an example with a timeout:

func main() {
    options := &parallelizer.Options{Timeout: time.Second}
    group := parallelizer.NewGroup(options)

    group.Add(func() {
        time.Sleep(time.Minute)

        for char := 'a'; char < 'a'+3; char++ {
            fmt.Printf("%c ", char)
        }
    })

    group.Add(func() {
        time.Sleep(time.Minute)

        for number := 1; number < 4; number++ {
            fmt.Printf("%d ", number)
        }
    })

    err := group.Run()

    fmt.Println()
    fmt.Println("Done")
    fmt.Printf("Error: %v", err)
}

Output:

Done
Error: timeout

Another solution that does not leak a wg.Wait() goroutine: just use the (well-supported and widely used) golang.org/x/sync/semaphore package:

  • Instead of sync.WaitGroup{}, use semaphore.NewWeighted(N) (you have to know N in advance).
  • Instead of wg.Add(1), use err := sem.Acquire(ctx, 1).
  • Instead of defer wg.Done(), use defer sem.Release(1).
  • Instead of wg.Wait(), you can use sem.Acquire(ctx, N) with a context that has a timeout.
  • Watch out: this is only equivalent to sync.WaitGroup in this specific use case (when you only call Acquire(ctx, 1) and Release(1) N times). Read the documentation carefully.

Example:

package main

import (
    "context"
    "log"
    "time"

    "golang.org/x/sync/semaphore"
)

func worker(n int) {
    time.Sleep(time.Duration(n) * time.Second)
    log.Printf("Worker %v finished", n)
}

func main() {

    const N = 5
    sem := semaphore.NewWeighted(N)

    for i := 0; i < N; i++ {

        err := sem.Acquire(context.Background(), 1)
        if err != nil {
            log.Fatal("sem.Acquire err", err)
        }
        go func(n int) {
            defer sem.Release(1)
            worker(n)
        }(i)
    }

    ctx, cancel := context.WithTimeout(context.Background(), time.Second*2)
    defer cancel()

    err := sem.Acquire(ctx, N)
    if err != nil {
        log.Println("sem.Acquire err:", err)
        return
    }

    log.Println("sem.Acquire ok")
}

Which results in:

2009/11/10 23:00:00 Worker 0 finished
2009/11/10 23:00:01 Worker 1 finished
2009/11/10 23:00:02 Worker 2 finished
2009/11/10 23:00:02 sem.Acquire err: context deadline exceeded

Throwing in a solution which does not leak a goroutine or rely on polling (sleeps):

import "sync/atomic"

type WaitGroup struct {
    count int32
    done chan struct{}
}

func NewWaitGroup() *WaitGroup {
    return &WaitGroup{
        done: make(chan struct{}),
    }
}

func (wg *WaitGroup) Add(i int32) {
    select {
    case <-wg.done:
        panic("use of an already closed WaitGroup")
    default:
    }
    atomic.AddInt32(&wg.count, i)
}

func (wg *WaitGroup) Done() {
    i := atomic.AddInt32(&wg.count, -1)
    if i == 0 {
        close(wg.done)
    }
    if i < 0 {
        panic("too many Done() calls")
    }
}

func (wg *WaitGroup) C() <-chan struct{} {
    return wg.done
}

Usage:

wg := NewWaitGroup()
wg.Add(1)
go func() {
  // do stuff
  wg.Done()
}()

select {
case <-wg.C():
  fmt.Printf("Completed!\n")
case <-time.After(time.Second):
  fmt.Printf("Timed out!\n")
}

We had the same need for one of our systems. By passing a context to the goroutines and cancelling that context when we hit the timeout, we prevent goroutine leaks.

func main() {
    ctx := context.Background()
    ctxWithCancel, cancelFunc := context.WithCancel(ctx)
    var wg sync.WaitGroup
    Provide(ctxWithCancel, 5, &wg)
    Provide(ctxWithCancel, 5, &wg)
    c := make(chan struct{})
    go func() {
        wg.Wait()
        close(c) // closing never blocks, even if nobody is receiving after a timeout
        fmt.Println("closed")
    }()

    select {
    case <-c:
    case <-time.After(20 * time.Millisecond):
        cancelFunc()
        fmt.Println("timeout")
    }
}

func Work(ctx context.Context, to int) {
    for i := 0; i < to; i++ {
        select {
        case <-ctx.Done():
            return
        default:
            fmt.Println(i)
            time.Sleep(10 * time.Millisecond)
        }
    }
}

func Provide(ctx context.Context, to int, wg *sync.WaitGroup) {
    wg.Add(1)
    go func() {
        Work(ctx, to)
        wg.Done()
    }()
}
