简体   繁体   English

Golang卡在WaitGroup中

[英]Golang stuck in WaitGroup

I'm stuck in my own wait loop and not really sure why. 我陷入了自己的等待循环,不确定为什么。 The function takes an input and output channel, then takes each item in the channel, executes an http.GET for the content and pulls the tag from the html. 该函数采用输入和输出通道,然后采用通道中的每个项目,对内容执行http.GET并从html中提取标记。

The process to GET and scrape is inside a go routine, and I've set up a wait group (innerWait) to be sure that I've processed everything before closing the output channel. GET和scrape的过程在go例程中,并且我已经建立了一个等待组(innerWait),以确保在关闭输出通道之前已经处理了所有内容。

   func (fp FeedProducer) getTitles(in <-chan feeds.Item,
    out chan<- feeds.Item,
    wg *sync.WaitGroup) {

    defer wg.Done()

    var innerWait sync.WaitGroup

    for item := range in {
        log.Infof(fp.c, "Incrementing inner WaitGroup.")
        innerWait.Add(1)
        go func(item feeds.Item) {
            defer innerWait.Done()
            defer log.Infof(fp.c, "Decriment inner wait group by defer.")
            client := urlfetch.Client(fp.c)
            resp, err := client.Get(item.Link.Href)
            log.Infof(fp.c, "Getting title for: %v", item.Link.Href)
            if err != nil {
                log.Errorf(fp.c, "Error retriving page. %v", err.Error())
                return
            }
            if strings.ToLower(resp.Header.Get("Content-Type")) == "text/html; charset=utf-8" {
                title := fp.scrapeTitle(resp)
                item.Title = title
            } else {
                log.Errorf(fp.c, "Wrong content type.  Received: %v from %v", resp.Header.Get("Content-Type"), item.Link.Href)
            }
            out <- item
        }(item)
    }
    log.Infof(fp.c, "Waiting for title pull wait group.")
    innerWait.Wait()
    log.Infof(fp.c, "Done waiting for title pull.")
    close(out)
}

func (fp FeedProducer) scrapeTitle(request *http.Response) string {
    defer request.Body.Close()
    tokenizer := html.NewTokenizer(request.Body)
    var titleIsNext bool
    for {
        token := tokenizer.Next()
        switch {
        case token == html.ErrorToken:
            log.Infof(fp.c, "Hit the end of the doc without finding title.")
            return ""
        case token == html.StartTagToken:
            tag := tokenizer.Token()
            isTitle := tag.Data == "title"

            if isTitle {
                titleIsNext = true
            }
        case titleIsNext && token == html.TextToken:
            title := tokenizer.Token().Data
            log.Infof(fp.c, "Pulled title: %v", title)
            return title
        }
    }
}

Log content looks like this: 日志内容如下所示:

2015/08/09 22:02:10 INFO: Revived query parameter: golang
2015/08/09 22:02:10 INFO: Getting active tweets from the last 7 days.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Incrementing inner WaitGroup.
2015/08/09 22:02:10 INFO: Waiting for title pull wait group.
2015/08/09 22:02:10 INFO: Getting title for: http://devsisters.github.io/goquic/
2015/08/09 22:02:10 INFO: Pulled title: GoQuic by devsisters
2015/08/09 22:02:10 INFO: Getting title for: http://whizdumb.me/2015/03/03/matching-a-string-and-extracting-values-using-regex/
2015/08/09 22:02:10 INFO: Pulled title: Matching a string and extracting values using regex | Whizdumb's blog
2015/08/09 22:02:10 INFO: Getting title for: https://www.reddit.com/r/golang/comments/3g7tyv/dropboxs_infrastructure_is_go_at_a_huge_scale/
2015/08/09 22:02:10 INFO: Pulled title: Dropbox's infrastructure is Go at a huge scale : golang
2015/08/09 22:02:10 INFO: Getting title for: http://dave.cheney.net/2015/08/08/performance-without-the-event-loop
2015/08/09 22:02:10 INFO: Pulled title: Performance without the event loop | Dave Cheney
2015/08/09 22:02:11 INFO: Getting title for: https://github.com/ccirello/sublime-gosnippets
2015/08/09 22:02:11 INFO: Pulled title: ccirello/sublime-gosnippets · GitHub
2015/08/09 22:02:11 INFO: Getting title for: https://medium.com/iron-io-blog/an-easier-way-to-create-tiny-golang-docker-images-7ba2893b160?mkt_tok=3RkMMJWWfF9wsRonuqTMZKXonjHpfsX57ewoWaexlMI/0ER3fOvrPUfGjI4ATsNrI%2BSLDwEYGJlv6SgFQ7LMMaZq1rgMXBk%3D&utm_content=buffer45a1c&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer
2015/08/09 22:02:11 INFO: Pulled title: An Easier Way to Create Tiny Golang Docker Images — Iron.io Blog — Medium

I can see that I'm getting to the innerWait.Wait() command based on the logs, which also tells me that the inbound channel has been closed on the other side of the pipe. 我可以看到我根据日志进入了innerWait.Wait()命令,该命令还告诉我入站通道在管道的另一侧已关闭。

It would appear that the defer statements in the anonymous function are not being called, as I can't see the deferred log statement printed anywhere. 似乎未调用匿名函数中的defer语句,因为我看不到任何地方都打印了deferred log语句。 But I can't for the life of me tell why as all code in that block appears to execute. 但是我无法终生说出为什么该块中的所有代码似乎都可以执行。

Help is appreciated. 感谢帮助。

The goroutines are stuck sending to out at this line: goroutine停留在此行发送out

        out <- item

The fix is to start a goroutine to receive on out . 解决方法是启动一个够程收到有关out

A good way to debug issues like this is to dump the goroutine stacks by sending the process a SIGQUIT. 调试此类问题的一种好方法是通过向进程发送SIGQUIT来转储goroutine堆栈。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM