
Deadlock with buffered channel

I have a job dispatcher that collates a large amount of data from many TCP sockets. The code grew out of the approach in "Large number of transient objects - avoiding contention", and it largely works: CPU usage is down by a huge amount and locking is no longer an issue.

From time to time my application locks up, and the "Channel length" log line is the only thing that keeps repeating, since data is still coming in from my sockets. However, the count stays at 5000 and no downstream processing takes place.

I think the issue might be a race condition, and the line it is possibly getting hung up on is channel <- msg within the select of the jobDispatcher. The trouble is that I can't work out how to verify this.

I suspect that, because select picks among ready cases at random, the goroutine is returning and the shutdownChan case doesn't get a chance to run. Then data hits inboundFromTCP and it blocks!

Someone might spot something really obviously wrong here, and hopefully offer a solution!

var MessageQueue = make(chan *trackingPacket_v1, 5000)

func init() {
    go jobDispatcher(MessageQueue)
}

func addMessage(trackingPacket *trackingPacket_v1) {
    // Send the packet to the buffered queue!
    log.Println("Channel length:", len(MessageQueue))
    MessageQueue <- trackingPacket
}

func jobDispatcher(inboundFromTCP chan *trackingPacket_v1) {
    var channelMap = make(map[string]chan *trackingPacket_v1)

    // Channel that listens for the strings that want to exit
    shutdownChan := make(chan string)

    for {
        select {
        case msg := <-inboundFromTCP:
            log.Println("Got packet", msg.Avr)
            channel, ok := channelMap[msg.Avr]
            if !ok {
                packetChan := make(chan *trackingPacket_v1)

                channelMap[msg.Avr] = packetChan
                go processPackets(packetChan, shutdownChan, msg.Avr)
                packetChan <- msg
                continue
            }
            channel <- msg
        case shutdownString := <-shutdownChan:
            log.Println("Shutting down:", shutdownString)
            channel, ok := channelMap[shutdownString]
            if ok {
                delete(channelMap, shutdownString)
                close(channel)
            }
        }
    }
}

func processPackets(ch chan *trackingPacket_v1, shutdown chan string, id string) {
    var messages = []*trackingPacket_v1{}

    tickChan := time.NewTicker(time.Second * 1)
    defer tickChan.Stop()

    hasCheckedData := false

    for {
        select {
        case msg := <-ch:
            log.Println("Got a messages for", id)
            messages = append(messages, msg)
            hasCheckedData = false
        case <-tickChan.C:

            messages = cullChanMessages(messages)
            if len(messages) == 0 {
                messages = nil
                shutdown <- id
                return
            }

            // No point running checking when packets have not changed!!
            if hasCheckedData == false {
                processMLATCandidatesFromChan(messages)
                hasCheckedData = true
            }
        case <-time.After(time.Duration(time.Second * 60)):
            log.Println("This channel has been around for 60 seconds which is too much, kill it")
            messages = nil
            shutdown <- id
            return
        }
    }
}

Update 01/20/16

I tried to rework this with the channelMap as a global protected by a mutex, but it still ended up deadlocking.


I slightly tweaked the code; it still locks up, but I don't see how this version does! https://play.golang.org/p/PGpISU4XBJ


Update 01/21/17 After some recommendations I put this into a standalone working example so people can see. https://play.golang.org/p/88zT7hBLeD

It is a long-running process, so it needs to be run locally on a machine because the playground kills it. Hopefully this will help get to the bottom of it!

I'm guessing that your problem is getting stuck doing this channel <- msg at the same time as the other goroutine is doing shutdown <- id.

Since neither the channel nor the shutdown channel is buffered, each send blocks until a receiver is ready. The two goroutines can therefore deadlock, each waiting for the other side to become available.

There are a couple of ways to fix it. You could declare both of those channels with a buffer of 1.
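To make the stall and the buffered variant concrete, here is a minimal sketch rather than the question's actual code. The helper crossSend and its 200 ms timeout are assumptions made up for illustration: one goroutine plays the worker doing shutdown <- id, the other plays the dispatcher doing channel <- msg before it gets around to reading the shutdown channel.

```go
package main

import (
	"fmt"
	"time"
)

// crossSend is a hypothetical helper modelling the two sends from the
// question. It reports whether both goroutines got past their send
// within a short timeout.
func crossSend(buffer int) bool {
	work := make(chan string, buffer)     // dispatcher -> worker
	shutdown := make(chan string, buffer) // worker -> dispatcher
	done := make(chan struct{}, 2)

	// Worker: announces that it is going away and returns.
	go func() {
		shutdown <- "worker-1"
		done <- struct{}{}
	}()

	// Dispatcher: forwards a packet first, and only then reads shutdown.
	go func() {
		work <- "packet"
		<-shutdown
		done <- struct{}{}
	}()

	deadline := time.After(200 * time.Millisecond)
	for i := 0; i < 2; i++ {
		select {
		case <-done:
		case <-deadline:
			return false // at least one side is still stuck on its send
		}
	}
	return true
}

func main() {
	fmt.Println("unbuffered completes:", crossSend(0))
	fmt.Println("buffer of 1 completes:", crossSend(1))
}
```

Note that in the buffered run the packet ends up sitting in the buffer of a worker that has already returned, and a buffer of 1 on a shutdown channel shared by many workers can still fill up, so this eases the stall in the simple case rather than ruling it out.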

Or, instead of signalling by sending a shutdown message, you could do what Go's context package does and signal shutdown by closing the shutdown channel. Look at https://golang.org/pkg/context/, especially WithCancel, WithDeadline and the Done function.

You might be able to use context to replace your own shutdown channel and timeout code.
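As a rough sketch of that idea, assuming one context per worker (the names worker, send and demo are hypothetical, and the 60-second timeout simply mirrors the time.After in the question): the worker stops when its context is done, and the dispatcher's send also watches the same Done channel, so it cannot block forever on a worker that has gone away.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// worker counts packets until its context is cancelled or times out,
// then reports the count on finished.
func worker(ctx context.Context, ch <-chan string, finished chan<- int) {
	count := 0
	for {
		select {
		case <-ch:
			count++
		case <-ctx.Done():
			finished <- count
			return
		}
	}
}

// send delivers msg unless the worker's context has ended.
// It reports whether the message was delivered.
func send(ctx context.Context, ch chan<- string, msg string) bool {
	select {
	case ch <- msg:
		return true
	case <-ctx.Done():
		return false
	}
}

// demo sends three packets, cancels the worker, then tries a late send.
func demo() (seen int, lateDelivered bool) {
	ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
	defer cancel()

	ch := make(chan string)
	finished := make(chan int)
	go worker(ctx, ch, finished)

	for i := 0; i < 3; i++ {
		send(ctx, ch, "packet")
	}
	cancel() // closing Done is the shutdown signal; nobody has to receive it
	seen = <-finished
	lateDelivered = send(ctx, ch, "late packet")
	return seen, lateDelivered
}

func main() {
	seen, late := demo()
	fmt.Println("packets seen:", seen, "late packet delivered:", late)
}
```

Because a closed Done channel is readable by any number of goroutines, no one has to be ready to receive the signal at a particular moment, which is what removes the mutual wait.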

And JimB has a point about shutting down the goroutine while it might still be receiving on the channel. What you should do is send the shutdown message (or close, or cancel the context) and continue to process messages until your ch channel is closed (detect that with case msg, ok := <-ch:), which would happen after the shutdown is received by the sender.

That way you get all of the messages that were incoming until the shutdown actually happened, and you should avoid a second deadlock.
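A minimal sketch of that pattern follows, under stated assumptions: the limit parameter is a made-up stand-in for the ticker and timeout checks in the question, and worker and run are hypothetical names. The worker offers its shutdown request inside the same select that keeps receiving, and it only returns once ch has been closed.

```go
package main

import "fmt"

// worker keeps receiving while its shutdown request is pending and
// returns only after ch is closed, reporting how many packets it handled.
func worker(id string, ch <-chan string, shutdown chan<- string, limit int, handled chan<- int) {
	count := 0
	var request chan<- string // nil: a send on a nil channel is never selected
	for {
		select {
		case _, ok := <-ch:
			if !ok {
				handled <- count // channel closed by the dispatcher: safe to exit
				return
			}
			count++
			if count == limit {
				request = shutdown // start offering the shutdown request
			}
		case request <- id:
			request = nil // request delivered; keep draining until ch is closed
		}
	}
}

// run plays the dispatcher: it keeps sending until the worker asks to
// stop, then closes the worker's channel.
func run(limit int) (sent, handled int) {
	ch := make(chan string)
	shutdown := make(chan string)
	result := make(chan int, 1)
	go worker("avr-1", ch, shutdown, limit, result)

	for {
		select {
		case ch <- "packet":
			sent++
		case <-shutdown:
			close(ch)
			return sent, <-result
		}
	}
}

func main() {
	sent, handled := run(3)
	fmt.Println("sent == handled:", sent == handled)
}
```

The exact number of packets varies from run to run because select chooses among ready cases at random, but every packet the dispatcher sent is handled and neither side blocks on the other.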

I'm new to Go, but in this code here

case msg := <-inboundFromTCP:
    log.Println("Got packet", msg.Avr)
    channel, ok := channelMap[msg.Avr]
    if !ok {
        packetChan := make(chan *trackingPacket_v1)

        channelMap[msg.Avr] = packetChan
        go processPackets(packetChan, shutdownChan, msg.Avr)
        packetChan <- msg
        continue
    }
    channel <- msg

Aren't you putting something in channel (unbuffered?) here

channel, ok := channelMap[msg.Avr]

So wouldn't you need to empty out that channel before you can add the msg here?

channel <- msg

Like I said, I'm new to Go so I hope I'm not being goofy. :)
