
Google Cloud Storage - Listing objects concurrently (Go)

I have a large number of objects in a GCS bucket that I want to list as fast as possible. The GCS API docs show this example:

it := client.Bucket(bucket).Objects(ctx, nil)
for {
    attrs, err := it.Next()
    if err == iterator.Done {
        break
    }
    if err != nil {
        return fmt.Errorf("Bucket(%q).Objects: %v", bucket, err)
    }
    fmt.Fprintln(w, attrs.Name)
}

But the API docs say that the iterator returned by Objects is not concurrency safe.

The wiki shows an iterator.Pager to control pagination, but no examples for concurrency:

it := client.Books(ctx, shelfName)
p := iterator.NewPager(it, pageSize, "") 
for {
    var books []*library.Book
    nextPageToken, err := p.NextPage(&books)
    if err != nil {
        return err
    }
    for _, b := range books {
        process(b)
    }
    if nextPageToken == "" {
        break
    }
}

You can add concurrency to this code by running the more time-consuming part of your code in separate goroutines. Assuming the process(book *library.Book) function runs for a long time, you can do either of the following:

for _, b := range books {
    go process(b)
}

or

go func(books []*library.Book) {
    for _, b := range books {
        process(b)
    }
}(books)

or even a combination of the two, depending on how your process function works. Since the iterator is not concurrency safe, avoid sharing it between goroutines, but you are free to do whatever you want with the values extracted from each iteration. You can apply the same logic to the first code example.
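One way to keep the iterator on a single goroutine while still processing items in parallel is a small worker pool. The following is only a sketch, using the standard library alone: fakeIterator, errDone and processAll are hypothetical names introduced for illustration. fakeIterator stands in for the object iterator returned by Objects, and errDone stands in for iterator.Done. In real code, the loop in processAll would call it.Next() on the GCS iterator and send attrs.Name (or the whole attrs value) on the channel.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// errDone is a stand-in (assumption) for iterator.Done.
var errDone = errors.New("no more items in iterator")

// fakeIterator is a hypothetical stand-in for the GCS object iterator.
// Like the real one, it is only ever used from a single goroutine here.
type fakeIterator struct {
	names []string
	pos   int
}

// Next returns the next name, or errDone when the list is exhausted.
func (it *fakeIterator) Next() (string, error) {
	if it.pos >= len(it.names) {
		return "", errDone
	}
	name := it.names[it.pos]
	it.pos++
	return name, nil
}

// processAll drains the iterator on the calling goroutine and hands each
// name to one of `workers` goroutines, which run process concurrently.
// Results are sorted so the output order is deterministic.
func processAll(it *fakeIterator, workers int, process func(string) string) ([]string, error) {
	if workers < 1 {
		workers = 1 // at least one worker, otherwise the send below would block forever
	}
	names := make(chan string)
	var (
		mu  sync.Mutex
		out []string
		wg  sync.WaitGroup
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for name := range names {
				r := process(name) // the slow part runs in parallel
				mu.Lock()
				out = append(out, r)
				mu.Unlock()
			}
		}()
	}

	// Only this goroutine touches the iterator.
	var iterErr error
	for {
		name, err := it.Next()
		if err == errDone {
			break
		}
		if err != nil {
			iterErr = err
			break
		}
		names <- name
	}
	close(names) // lets the workers' range loops finish
	wg.Wait()    // wait until every handed-out item has been processed

	sort.Strings(out)
	return out, iterErr
}

func main() {
	it := &fakeIterator{names: []string{"c.txt", "a.txt", "b.txt"}}
	out, err := processAll(it, 2, func(name string) string {
		return "processed " + name
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, s := range out {
		fmt.Println(s)
	}
}
```

Note that this only parallelizes the work done per object. The listing itself is still sequential, because a single iterator fetches one page after another. Unlike a bare go process(b), the WaitGroup also ensures the function does not return before all goroutines have finished.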

