
Coordination and graceful closing of goroutines

I am totally new to Go and just learning. At the moment I am playing with channels and goroutines and trying to understand coordination, and I got stuck on a problem.

I do not understand what is happening behind the curtains when I start two goroutines:

- serviceAlpha's responsibility is to send a heartbeat tick to serviceBeta, and to die if it gets an abort event on the abort channel.
- serviceBeta's responsibility is to write some dummy string to my log output. It has a TTL, and if it does not get a heartbeat from serviceAlpha within the TTL it should finish its job and return to its caller. Before exiting it should send an abort event to serviceAlpha on the abort channel.

My problem: I do not understand why the program keeps running after the TTL has expired. The output looks like this:

2019/04/28 17:21:44 Main | START
2019/04/28 17:21:44 serviceBeta | START
2019/04/28 17:21:44 serviceAlpha | START
2019/04/28 17:21:44 serviceAlpha | There was not abort
2019/04/28 17:21:44 serviceAlpha | rand = 3
2019/04/28 17:21:45 serviceBeta | TICKER:  2019-04-28 17:21:45.2870594 +0200 CEST m=+1.003883901
2019/04/28 17:21:46 serviceBeta | TICKER:  2019-04-28 17:21:46.2867957 +0200 CEST m=+2.003620201
2019/04/28 17:21:47 serviceBeta | TICKER:  2019-04-28 17:21:47.2866242 +0200 CEST m=+3.003448701
2019/04/28 17:21:47 serviceAlpha | There was not abort
2019/04/28 17:21:47 serviceAlpha | rand = 5
2019/04/28 17:21:48 serviceBeta | TICKER:  2019-04-28 17:21:48.2869918 +0200 CEST m=+4.003816301
2019/04/28 17:21:49 serviceBeta | TICKER:  2019-04-28 17:21:49.2863265 +0200 CEST m=+5.003151001
2019/04/28 17:21:50 serviceBeta | TICKER:  2019-04-28 17:21:50.2868071 +0200 CEST m=+6.003631601
2019/04/28 17:21:51 serviceBeta | TICKER:  2019-04-28 17:21:51.2866738 +0200 CEST m=+7.003498301
2019/04/28 17:21:51 serviceBeta | ABORT 2019-04-28 17:21:51.2866738 +0200 CEST m=+7.003498301

After that it keeps running. The "serviceBeta | STOP" line is missing, and so are the STOP lines of serviceAlpha and Main.

I am pretty sure there are lots of problems with this code, so my questions are:

- What is the problem with this design? :)
- How should it be coded the right way?

Any help would be appreciated!

Darvi

package main

import (
    "log"
    rand2 "math/rand"
    "sync"
    "time"
)

func main() {
    log.Println("Main | START")
    var wg sync.WaitGroup
    var resetTTL = make(chan interface{})
    var abort = make(chan interface{})
    defer func() {
        log.Println("Main | defer closing channels")
        close(resetTTL)
        close(abort)
    }()

    wg.Add(1)
    go serviceAlpha(1, 5, resetTTL, abort, &wg)

    wg.Add(1)
    go serviceBeta(4*time.Second, resetTTL, abort, &wg)

    wg.Wait()
    log.Println("Main | STOP")
}

func serviceAlpha(min, max int, ttlReset chan<- interface{}, abort <-chan interface{}, wg *sync.WaitGroup) {
    log.Println("serviceAlpha | START")
    var randTTL int
loop:
    for {
        select {
        case <-abort:
            log.Println("serviceAlpha | There was an abort, breaking from loop")
            break loop
        default:
            log.Println("serviceAlpha | There was not abort")
            break
        }

        randTTL = rand2.Intn(max-min) + min + 1
        log.Printf("serviceAlpha | rand = %v", randTTL)
        time.Sleep(time.Duration(randTTL) * time.Second)
        ttlReset <- true
    }

    log.Println("serviceAlpha | STOP")
    wg.Done()
}

func serviceBeta(ttl time.Duration, ttlReset <-chan interface{}, abort chan<- interface{}, wg *sync.WaitGroup) {
    log.Println("serviceBeta | START")
    var ttlTimer = time.NewTimer(ttl)
    var tickerTimer = time.NewTicker(1 * time.Second)

loop:
    for {
        select {
        case <-ttlReset:
            ttlTimer.Stop()
            ttlTimer.Reset(ttl)
        case tt := <-tickerTimer.C:
            log.Println("serviceBeta | TICKER: ", tt)
        case ttl := <-ttlTimer.C:
            log.Println("serviceBeta | ABORT", ttl)
            break loop
        }
    }

    abort <- true

    log.Println("serviceBeta | STOP")
    wg.Done()
}

I have just figured it out by rubber duck debugging! :)

The abort channel needs a capacity of 1 ;) and serviceAlpha should check abort right when it is about to reset the TTL, not at the top of the loop.

In main:

var abort = make(chan interface{}, 1)

And the reworked serviceAlpha:

func serviceAlpha(min, max int, ttlReset chan<- interface{}, abort <-chan interface{}, wg *sync.WaitGroup) {
    log.Println("serviceAlpha | START")
    var randTTL int
loop:
    for {
        randTTL = rand2.Intn(max-min) + min + 1
        log.Printf("serviceAlpha | rand = %v", randTTL)
        time.Sleep(time.Duration(randTTL) * time.Second)

        select {
        case <-abort:
            log.Println("serviceAlpha | There was an abort, breaking from loop")
            break loop
        default:
            log.Println("serviceAlpha | There was not any abort event")
            ttlReset <- true
            break
        }
    }

    log.Println("serviceAlpha | STOP")
    wg.Done()
}

While your solution may work for this case, the design is still flawed. The problem in your original code was a deadlock: serviceAlpha was blocked writing to ttlReset after serviceBeta had stopped listening on that channel, and serviceBeta was blocked writing to abort while serviceAlpha was not listening on it. Both channels are unbuffered, so each send waits for a receiver that never arrives, and neither goroutine reaches wg.Done(). Using a buffered abort channel works because serviceBeta's send no longer blocks. Note that serviceAlpha can still hang if it reaches ttlReset <- true just after serviceBeta has left its loop.
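The difference between the two kinds of channel can be seen in isolation. The following is a minimal sketch, not part of the original code: trySend is a hypothetical helper that attempts a send inside a select with a default case, so it reports whether the send could complete instead of blocking.

```go
package main

import "fmt"

// trySend is a hypothetical helper for illustration only. It attempts one
// send and reports whether it completed without blocking.
func trySend(ch chan<- interface{}) bool {
	select {
	case ch <- true:
		return true
	default:
		// Nobody is receiving and there is no free buffer slot.
		return false
	}
}

func main() {
	unbuffered := make(chan interface{})
	buffered := make(chan interface{}, 1)

	// No goroutine is receiving, so an unbuffered send cannot proceed.
	fmt.Println("unbuffered, no receiver:", trySend(unbuffered))
	// The buffer has one free slot, so the first send completes.
	fmt.Println("buffered, first send:", trySend(buffered))
	// The buffer is now full, so a second send would block again.
	fmt.Println("buffered, second send:", trySend(buffered))
}
```

This also shows the limit of the buffered approach: a capacity of 1 only absorbs a single send, which is enough for one abort event but does not help a sender that writes repeatedly, such as the heartbeat on ttlReset.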

However, I was able to make it work by using a ticker instead of time.Sleep, so that serviceAlpha listens for abort while it waits. Here is the modified code for serviceAlpha:

func serviceAlpha(min, max int, ttlReset chan<- interface{}, abort <-chan interface{}, wg *sync.WaitGroup) {
    log.Println("serviceAlpha | START")
    var randTTL int
loop:
    for {
        randTTL = rand2.Intn(max-min) + min + 1
        log.Printf("serviceAlpha | rand = %v", randTTL)
        // A new ticker is created on every iteration, so it is stopped
        // as soon as the select returns to avoid leaking tickers.
        ticker := time.NewTicker(time.Duration(randTTL) * time.Second)
        select {
        case <-abort:
            // Waiting on abort here means serviceBeta's send is always received.
            ticker.Stop()
            log.Println("serviceAlpha | There was an abort, breaking from loop")
            break loop
        case <-ticker.C:
            ticker.Stop()
            log.Println("serviceAlpha | There was not abort")
            ttlReset <- true
        }
    }

    log.Println("serviceAlpha | STOP")
    wg.Done()
}
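
As a possible further step, every blocking operation can be put in a select together with abort, and abort can be closed instead of written to, since closing never blocks and wakes every listener. The following is only a sketch under assumptions that are not in the original code: the names heartbeat, watchdog and demo are hypothetical, the channels carry struct{} instead of interface{}, the random sleep is replaced by a fixed list of delays, and the durations are shortened to milliseconds so that it finishes quickly.

```go
package main

import (
	"log"
	"sync"
	"time"
)

// heartbeat (hypothetical stand-in for serviceAlpha) waits for each delay in
// turn and then sends one tick. Both the wait and the send select on abort,
// so it cannot get stuck on a channel that nobody reads any more.
func heartbeat(delays []time.Duration, ttlReset chan<- struct{}, abort <-chan struct{}, wg *sync.WaitGroup) {
	defer wg.Done()
	for _, d := range delays {
		select {
		case <-abort:
			return
		case <-time.After(d):
		}
		select {
		case <-abort:
			return
		case ttlReset <- struct{}{}:
		}
	}
}

// watchdog (hypothetical stand-in for serviceBeta) counts heartbeats and
// returns once none arrives within ttl. It closes abort on the way out.
func watchdog(ttl time.Duration, ttlReset <-chan struct{}, abort chan<- struct{}) int {
	defer close(abort) // closing never blocks, unlike a send
	timer := time.NewTimer(ttl)
	defer timer.Stop()
	beats := 0
	for {
		select {
		case <-ttlReset:
			beats++
			// Drain a possibly fired timer before reusing it.
			if !timer.Stop() {
				select {
				case <-timer.C:
				default:
				}
			}
			timer.Reset(ttl)
		case <-timer.C:
			return beats
		}
	}
}

// demo wires both goroutines together and returns the number of heartbeats
// the watchdog received before its TTL expired.
func demo() int {
	ttlReset := make(chan struct{})
	abort := make(chan struct{})
	// Two quick heartbeats, then a pause far longer than the TTL.
	delays := []time.Duration{5 * time.Millisecond, 5 * time.Millisecond, time.Second}

	var wg sync.WaitGroup
	beats := 0
	wg.Add(2)
	go heartbeat(delays, ttlReset, abort, &wg)
	go func() {
		defer wg.Done()
		beats = watchdog(100*time.Millisecond, ttlReset, abort)
	}()
	wg.Wait()
	return beats
}

func main() {
	log.Printf("Main | STOP after %d heartbeats", demo())
}
```

In this variant main must not close abort itself, because closing a channel twice panics. The channel is owned by the one goroutine that signals on it.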
