I want to download files in parallel in Go, but my code never exits:
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"path/filepath"
	"sync"
)

func download_file(file_path string, wg sync.WaitGroup) {
	defer wg.Done()
	resp, _ := http.Get(file_path)
	defer resp.Body.Close()
	filename := filepath.Base(file_path)
	file, _ := os.Create(filename)
	defer file.Close()
	size, _ := io.Copy(file, resp.Body)
	fmt.Println(filename, size, resp.Status)
}

func main() {
	var wg sync.WaitGroup
	file_list := []string{
		"http://i.imgur.com/dxGb2uZ.jpg",
		"http://i.imgur.com/RSU6NxX.jpg",
		"http://i.imgur.com/hUWgS2S.jpg",
		"http://i.imgur.com/U8kaix0.jpg",
		"http://i.imgur.com/w3cEYpY.jpg",
		"http://i.imgur.com/ooSCD9T.jpg"}
	fmt.Println(len(file_list))
	for _, url := range file_list {
		wg.Add(1)
		fmt.Println(wg)
		go download_file(url, wg)
	}
	wg.Wait()
}
What's the reason? I've looked at "Golang download multiple files in parallel using goroutines" but found no solution there. What is the best way to debug such code?
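As Tim Cooper said, you need to pass the WaitGroup as a pointer. When it is passed by value, each goroutine receives its own copy, so wg.Done() decrements the copy and the wg.Wait() in main never returns. If you run the go vet tool on your code, it will give you this warning: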
$ go vet ex.go
ex.go:12: download_file passes Lock by value: sync.WaitGroup contains sync.Mutex
exit status 1
I recommend using an editor that can run go vet for you when you save a file, for example go-plus for Atom.
As for the code, I think you should restructure it like this:
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"path/filepath"
	"sync"
)

// downloadFile fetches filePath and saves it in the current directory
// under the last element of the URL path. It blocks until it is done.
func downloadFile(filePath string) error {
	resp, err := http.Get(filePath)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	name := filepath.Base(filePath)
	file, err := os.Create(name)
	if err != nil {
		return err
	}
	defer file.Close()
	size, err := io.Copy(file, resp.Body)
	if err != nil {
		return err
	}
	fmt.Println(name, size, resp.Status)
	return nil
}

func main() {
	var wg sync.WaitGroup
	fileList := []string{
		"http://i.imgur.com/dxGb2uZ.jpg",
		"http://i.imgur.com/RSU6NxX.jpg",
		"http://i.imgur.com/hUWgS2S.jpg",
		"http://i.imgur.com/U8kaix0.jpg",
		"http://i.imgur.com/w3cEYpY.jpg",
		"http://i.imgur.com/ooSCD9T.jpg"}
	fmt.Println("downloading", len(fileList), "files")
	for _, url := range fileList {
		wg.Add(1)
		// The closure shares wg with main, so no WaitGroup is ever copied.
		go func(url string) {
			defer wg.Done()
			err := downloadFile(url)
			if err != nil {
				fmt.Println("[error]", url, err)
			}
		}(url)
	}
	wg.Wait()
}
I don't like passing WaitGroups around and prefer to keep functions simple, blocking, and sequential, and then stitch together the concurrency at a higher level. This gives you the option of doing it all sequentially without having to change downloadFile.
I also added error handling and fixed names so they are camelCase.
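To illustrate the point about stitching concurrency together at a higher level, here is a minimal sketch in which the same blocking function is driven either sequentially or concurrently. The names fetch, runSequential, and runConcurrent are hypothetical and not part of the code above, the example.com URLs are placeholders, and fetch does no network I/O; downloadFile has the same signature and could be passed in its place.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// fetch is a hypothetical stand-in for downloadFile: a plain blocking
// function that knows nothing about goroutines or WaitGroups.
func fetch(url string) error {
	if url == "" {
		return errors.New("empty url")
	}
	return nil
}

// runSequential calls work for each URL in turn and returns the number
// of calls that failed.
func runSequential(urls []string, work func(string) error) int {
	failed := 0
	for _, u := range urls {
		if err := work(u); err != nil {
			failed++
		}
	}
	return failed
}

// runConcurrent calls work for each URL in its own goroutine, waits for
// all of them, and returns the number of calls that failed.
func runConcurrent(urls []string, work func(string) error) int {
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex // guards failed
		failed int
	)
	for _, u := range urls {
		wg.Add(1)
		go func(u string) {
			defer wg.Done()
			if err := work(u); err != nil {
				mu.Lock()
				failed++
				mu.Unlock()
			}
		}(u)
	}
	wg.Wait()
	return failed
}

func main() {
	// Placeholder URLs; the empty string triggers the error path in fetch.
	urls := []string{"http://example.com/a.jpg", "", "http://example.com/b.jpg"}
	fmt.Println("sequential failures:", runSequential(urls, fetch))
	fmt.Println("concurrent failures:", runConcurrent(urls, fetch))
}
```

Because the WaitGroup lives entirely inside runConcurrent, the worker function never sees it, and switching between the two modes is a one-line change at the call site.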
Adding to Calab's response, there's absolutely nothing wrong with your approach; all you have to do is pass a pointer to the sync.WaitGroup.
func download_file(file_path string, wg *sync.WaitGroup) {
	defer wg.Done()
	......
}
.....
go download_file(url, &wg)
.....
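For a version of the pointer approach that can be run without any network access, here is a minimal sketch. The names worker and runWorkers are hypothetical; worker stands in for download_file and only squares its index, but the WaitGroup is handled the same way as in the fragment above.

```go
package main

import (
	"fmt"
	"sync"
)

// worker is a hypothetical stand-in for download_file. It receives a
// pointer, so wg.Done() decrements the caller's WaitGroup, not a copy.
func worker(id int, results []int, wg *sync.WaitGroup) {
	defer wg.Done()
	results[id] = id * id // each goroutine writes to its own element
}

// runWorkers starts n workers, waits for all of them, and returns the results.
func runWorkers(n int) []int {
	var wg sync.WaitGroup
	results := make([]int, n)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go worker(i, results, &wg)
	}
	wg.Wait() // returns once every worker has called Done
	return results
}

func main() {
	fmt.Println(runWorkers(4)) // prints [0 1 4 9]
}
```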