
Downloading files with goroutines?

I'm new to Go and I'm learning how to work with goroutines.

I have a function that downloads images:

func imageDownloader(uri string, filename string) {
    fmt.Println("starting download for ", uri)

    outFile, err := os.Create(filename)
    defer outFile.Close()
    if err != nil {
        os.Exit(1)
    }

    client := &http.Client{}

    req, err := http.NewRequest("GET", uri, nil)

    resp, err := client.Do(req)
    defer resp.Body.Close()

    if err != nil {
        panic(err)
    }

    header := resp.ContentLength
    bar := pb.New(int(header))
    rd := bar.NewProxyReader(resp.Body)
    // and copy from reader
    io.Copy(outFile, rd)
}

When I call it by itself as part of another function, it downloads images completely and there is no truncated data.

However, when I modify it to run as a goroutine, images are often truncated or end up as zero-length files.

func imageDownloader(uri string, filename string, wg *sync.WaitGroup) {
    ...
    io.Copy(outFile, rd)
    wg.Done()
}

func main() {
    var wg sync.WaitGroup
    wg.Add(1)
    go imageDownloader(url, file, &wg)
    wg.Wait()
}

Am I using WaitGroups incorrectly? What could cause this and how can I fix it?

Update:

Solved it. I had placed the wg.Add() call outside of a loop. :(

While I'm not sure exactly what's causing your issue, here are two options for getting it back into working order.

First, following the WaitGroup example in the sync package documentation, try calling defer wg.Done() at the beginning of your function. That ensures the WaitGroup is properly decremented even if the goroutine ends unexpectedly.

Second, io.Copy returns an error that you're not checking. That's not great practice anyway, but in your particular case it's preventing you from seeing if there is indeed an error in the copying routine. Check it and deal with it appropriately. It also returns the number of bytes written, which might help you as well.

Your example doesn't have anything obviously wrong with its use of WaitGroups. As long as you are calling wg.Add() with the same number as the number of goroutines you launch, or incrementing it by 1 every time you start a new goroutine, that should be correct.

However, you call os.Exit and panic for certain error conditions in the goroutine, so if you have more than one of these running, a failure in any one of them will terminate the whole program, regardless of the use of WaitGroups. If it's failing without a panic message, I would take a look at the os.Exit(1) line, which exits immediately without printing anything or running deferred functions.

It would also be good practice in Go to use defer wg.Done() at the start of your function, so that even if an error occurs, the goroutine still decrements its counter. That way your main goroutine won't hang in wg.Wait() if one of the goroutines returns early because of an error.

One change I would make in your example is to use defer when calling Done. I think defer wg.Done() should be the first statement in your function.

I like WaitGroup's simplicity. However, I do not like that we need to pass a reference to it into the goroutine, because that mixes the concurrency logic with your business logic.

So I came up with this generic function to solve this problem for me:

// Parallelize parallelizes the function calls
func Parallelize(functions ...func()) {
    var waitGroup sync.WaitGroup
    waitGroup.Add(len(functions))

    defer waitGroup.Wait()

    for _, function := range functions {
        go func(copy func()) {
            defer waitGroup.Done()
            copy()
        }(function)
    }
}

So your example could be solved this way:

func imageDownloader(uri string, filename string) {
    ...
    io.Copy(outFile, rd)
}

func main() {
    functions := []func(){}
    list := make([]Object, 5)
    for _, object := range list {
        object := object // per-iteration copy for the closure to capture
        functions = append(functions, func() {
            imageDownloader(object.uri, object.filename)
        })
    }

    Parallelize(functions...)

    fmt.Println("Done")
}

If you would like to use it, you can find it here https://github.com/shomali11/util
