
Golang concurrent download deadlock

I want to download files in parallel in Go, but my code never exits:

package main

import (
    "fmt"
    "io"
    "net/http"
    "os"
    "path/filepath"
    "sync"
)

func download_file(file_path string, wg sync.WaitGroup) {
    defer wg.Done()
    resp, _ := http.Get(file_path)
    defer resp.Body.Close()
    filename := filepath.Base(file_path)
    file, _ := os.Create(filename)
    defer file.Close()

    size, _ := io.Copy(file, resp.Body)
    fmt.Println(filename, size, resp.Status)
}

func main() {
    var wg sync.WaitGroup

    file_list := []string{
        "http://i.imgur.com/dxGb2uZ.jpg",
        "http://i.imgur.com/RSU6NxX.jpg",
        "http://i.imgur.com/hUWgS2S.jpg",
        "http://i.imgur.com/U8kaix0.jpg",
        "http://i.imgur.com/w3cEYpY.jpg",
        "http://i.imgur.com/ooSCD9T.jpg"}
    fmt.Println(len(file_list))
    for _, url := range file_list {
        wg.Add(1)
        fmt.Println(wg)
        go download_file(url, wg)

    }
    wg.Wait()
}

What's the reason? I've looked at "Golang download multiple files in parallel using goroutines", but found no solution there. What is the best way to debug code like this?

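As Tim Cooper said, you need to pass the WaitGroup as a pointer. Passing it by value gives each goroutine its own copy, so wg.Done() never decrements the counter that main is waiting on and wg.Wait() blocks forever. If you run the go vet tool on your code, it gives you this warning: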

$ go vet ex.go
ex.go:12: download_file passes Lock by value: sync.WaitGroup contains sync.Mutex
exit status 1
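
To illustrate what the warning is pointing at, here is a minimal sketch of Go's pass-by-value behaviour. The counter type and both functions are hypothetical stand-ins, not part of the sync package; they only mimic how a copied WaitGroup loses the connection to the original.

```go
package main

import "fmt"

// counter is a hypothetical stand-in for the state inside a sync.WaitGroup.
type counter struct{ n int }

// doneByValue receives a copy, so the caller's counter is left unchanged.
func doneByValue(c counter) { c.n-- }

// doneByPointer decrements the caller's counter through the pointer.
func doneByPointer(c *counter) { c.n-- }

func main() {
	c := counter{n: 1}

	doneByValue(c)
	fmt.Println("after doneByValue:", c.n) // still 1: only the copy changed

	doneByPointer(&c)
	fmt.Println("after doneByPointer:", c.n) // now 0
}
```

The first call corresponds to the original download_file signature: the decrement happens, but on a copy that main never sees.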

I recommend using an editor that runs this check for you whenever you save a file, for example go-plus for Atom.

As for the code, I think you should restructure it like this:

package main

import (
    "fmt"
    "io"
    "net/http"
    "os"
    "path/filepath"
    "sync"
)

func downloadFile(filePath string) error {
    resp, err := http.Get(filePath)
    if err != nil {
        return err
    }
    defer resp.Body.Close()

    name := filepath.Base(filePath)

    file, err := os.Create(name)
    if err != nil {
        return err
    }
    defer file.Close()

    size, err := io.Copy(file, resp.Body)
    if err != nil {
        return err
    }
    fmt.Println(name, size, resp.Status)
    return nil
}

func main() {
    var wg sync.WaitGroup

    fileList := []string{
        "http://i.imgur.com/dxGb2uZ.jpg",
        "http://i.imgur.com/RSU6NxX.jpg",
        "http://i.imgur.com/hUWgS2S.jpg",
        "http://i.imgur.com/U8kaix0.jpg",
        "http://i.imgur.com/w3cEYpY.jpg",
        "http://i.imgur.com/ooSCD9T.jpg"}
    fmt.Println("downloading", len(fileList), "files")
    for _, url := range fileList {
        wg.Add(1)
        go func(url string) {
            // Deferring Done guarantees it runs however the goroutine returns.
            defer wg.Done()
            if err := downloadFile(url); err != nil {
                fmt.Println("[error]", url, err)
            }
        }(url)
    }
    wg.Wait()
}

I don't like passing WaitGroups around. I prefer to keep functions simple, blocking and sequential, and then stitch the concurrency together at a higher level. This gives you the option of doing it all sequentially without having to change downloadFile.
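
As a rough sketch of that idea, the concurrency can live in one small helper while the worker stays a plain blocking function. The names forEachConcurrently, forEachSequentially and fakeDownload are made up for this example; fakeDownload stands in for downloadFile so the sketch runs without network access.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// fakeDownload is a hypothetical stand-in for a blocking download function.
func fakeDownload(url string) error {
	if url == "" {
		return errors.New("empty URL")
	}
	return nil
}

// forEachConcurrently calls fn once per item, each in its own goroutine,
// and returns the resulting errors in input order.
func forEachConcurrently(items []string, fn func(string) error) []error {
	errs := make([]error, len(items))
	var wg sync.WaitGroup
	for i, item := range items {
		wg.Add(1)
		go func(i int, item string) {
			defer wg.Done()
			errs[i] = fn(item) // each goroutine writes only its own slot
		}(i, item)
	}
	wg.Wait()
	return errs
}

// forEachSequentially has the same signature but uses no goroutines.
func forEachSequentially(items []string, fn func(string) error) []error {
	errs := make([]error, len(items))
	for i, item := range items {
		errs[i] = fn(item)
	}
	return errs
}

func main() {
	urls := []string{"http://example.com/a.jpg", "", "http://example.com/b.jpg"}
	for i, err := range forEachConcurrently(urls, fakeDownload) {
		if err != nil {
			fmt.Println("[error]", i, err)
		}
	}
}
```

Switching between the two helpers is a one-word change at the call site, and the worker function never needs to know which one is in use.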

I also added error handling and renamed the identifiers to camelCase.

Adding to Calab's response: there is nothing wrong with your approach. All you had to do was pass a pointer to the sync.WaitGroup.

func download_file(file_path string, wg *sync.WaitGroup) {
    defer wg.Done()
    ......
}
.....
        go download_file(url, &wg)
.....
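
For reference, here is a complete, offline sketch of the pointer-based pattern. The fetch and fetchAll functions are hypothetical: fetch stands in for download_file and just reports its argument on a channel instead of touching the network.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// fetch is a hypothetical worker. It takes *sync.WaitGroup, so Done
// decrements the same counter that the caller is waiting on.
func fetch(name string, wg *sync.WaitGroup, done chan<- string) {
	defer wg.Done()
	done <- name
}

// fetchAll starts one goroutine per name, waits for all of them, and
// returns the finished names in sorted order.
func fetchAll(names []string) []string {
	var wg sync.WaitGroup
	done := make(chan string, len(names)) // buffered so workers never block
	for _, name := range names {
		wg.Add(1)
		go fetch(name, &wg, done)
	}
	wg.Wait()
	close(done)

	var finished []string
	for name := range done {
		finished = append(finished, name)
	}
	sort.Strings(finished)
	return finished
}

func main() {
	fmt.Println(fetchAll([]string{"b.jpg", "a.jpg", "c.jpg"})) // [a.jpg b.jpg c.jpg]
}
```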

