
Google Cloud Storage - Listing objects concurrently (Go)

I have a large number of objects in a GCS bucket that I want to list as fast as possible. The GCS API docs show this example:

it := client.Bucket(bucket).Objects(ctx, nil)
for {
    attrs, err := it.Next()
    if err == iterator.Done {
        break
    }
    if err != nil {
        return fmt.Errorf("Bucket(%q).Objects: %v", bucket, err)
    }
    fmt.Fprintln(w, attrs.Name)
}

But the API docs say that the iterator returned by Objects is not concurrency safe.

The wiki shows an iterator.Pager for controlling pagination, but gives no examples of concurrent use:

it := client.Books(ctx, shelfName)
p := iterator.NewPager(it, pageSize, "") 
for {
    var books []*library.Book
    nextPageToken, err := p.NextPage(&books)
    if err != nil {
        return err
    }
    for _, b := range books {
        process(b)
    }
    if nextPageToken == "" {
        break
    }
}

You can add concurrency to this code by running the time-consuming part in separate goroutines. Assuming the process(book *library.Book) function runs for a long time, you can do either of the following:

for _, b := range books {
    go process(b)
}

or

go func(books []*library.Book) {
    for _, b := range books {
        process(b)
    }
}(books)

or even a combination of the two, depending on how your process function works. Because the iterator is not safe for concurrent use, avoid sharing it between goroutines; the values extracted in each iteration, however, can be handed to other goroutines freely. The same logic applies to the first code example.
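One way to apply this to the first example is sketched below: a single goroutine drives the iterator, and a fixed pool of workers handles the extracted names. To keep the sketch runnable without a GCS client, nameIterator and errDone are hypothetical stand-ins for *storage.ObjectIterator and iterator.Done, and processAll is an illustrative helper name, not part of any library. Unlike a bare `go process(b)`, this version waits for all workers to finish and caps the number of goroutines.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// errDone is a stand-in (assumption) for iterator.Done.
var errDone = errors.New("no more items")

// nameIterator is a hypothetical stand-in for *storage.ObjectIterator.
// Like the real iterator, it is only ever used from one goroutine here.
type nameIterator struct {
	names []string
	pos   int
}

// Next returns the next name, or errDone when the names are exhausted.
func (it *nameIterator) Next() (string, error) {
	if it.pos >= len(it.names) {
		return "", errDone
	}
	name := it.names[it.pos]
	it.pos++
	return name, nil
}

// processAll drains it from the calling goroutine only and fans the
// names out to `workers` goroutines, each of which calls process.
// It returns after every worker has finished.
func processAll(it *nameIterator, workers int, process func(string)) error {
	names := make(chan string)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for name := range names {
				process(name)
			}
		}()
	}

	var iterErr error
	for {
		name, err := it.Next()
		if err == errDone {
			break
		}
		if err != nil {
			iterErr = err
			break
		}
		names <- name // only values cross goroutines, never the iterator
	}
	close(names) // lets the workers' range loops end
	wg.Wait()
	return iterErr
}

func main() {
	it := &nameIterator{names: []string{"a.txt", "b.txt", "c.txt", "d.txt"}}

	var mu sync.Mutex
	var seen []string
	err := processAll(it, 2, func(name string) {
		mu.Lock()
		seen = append(seen, name)
		mu.Unlock()
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	sort.Strings(seen) // worker order is nondeterministic, so sort first
	fmt.Println(seen)  // prints [a.txt b.txt c.txt d.txt]
}
```

With the real client, the loop body would presumably send attrs.Name (or the whole attrs value) on the channel instead. Note that this only parallelizes the processing; the listing itself still runs sequentially through one iterator.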
