
Cloud Run downloading a file from GCS is insanely slow

I have a Go Cloud Run app that downloads a 512 MB file from GCS when it starts (the program needs this file). Locally, on my nothing-special home connection, this works fine and the download takes a few seconds, but when I deploy to Cloud Run it downloads at a snail's pace. I had to increase timeouts and log a progress counter just to make sure it was doing something (it was). It seems to be downloading at about 30 Kb/s, which is not going to work.

The Cloud Run instance and the regional GCS bucket are both in us-east4. There don't seem to be any knobs I can turn to fix this, and I don't see this issue or constraint documented anywhere.

Anyone have any ideas what could be the issue?

Here is the code doing the downloading, along with copious logging because I couldn't tell if it was doing anything at first:

func LoadFilter() error {
    fmt.Println("loading filter")
    ctx := context.Background()
    storageClient, err := storage.NewClient(ctx)
    if err != nil {
        return err
    }
    defer storageClient.Close()

    ctx, cancel := context.WithTimeout(ctx, time.Minute*60)
    defer cancel()

    obj := storageClient.Bucket("my_slow_bucket").Object("filter_export")
    rc, err := obj.NewReader(ctx)
    if err != nil {
        return err
    }
    defer rc.Close()

    attrs, err := obj.Attrs(ctx)
    if err != nil {
        return err
    }
    progressR := &ioprogress.Reader{
        Reader: rc,
        Size:   attrs.Size,
        DrawFunc: func(p int64, t int64) error {
            fmt.Printf("%.2f\n", float64(p)/float64(t)*100)
            return nil
        },
    }

    fmt.Println("reading filter...")
    data, err := ioutil.ReadAll(progressR)
    if err != nil {
        return err
    }

    fmt.Println("decoding filter...")
    filter, err := cuckoo.Decode(data)
    if err != nil {
        return err
    }

    fmt.Println("filter decoded")

    cf = filter

    fmt.Println("initialized the filter successfully!")

    return nil
}

Indeed, what @wlhee said is perfectly true. If you have any activities that run outside of the request pipeline, they will not have access to the full CPU provided to your instances. A download kicked off at startup, rather than from inside a request handler, falls into that category. As the documentation says:

When an application running on Cloud Run finishes handling a request, the container instance's access to CPU will be disabled or severely limited. Therefore, you should not start background threads or routines that run outside the scope of the request handlers.

Running background threads can result in unexpected behavior because any subsequent request to the same container instance resumes any suspended background activity.

I suggest that you run the Cloud Storage download in response to a request to your service: hit a startup endpoint in your app, finish the download, and only then return the response that ends the request.

Please check this documentation for tips on Cloud Run.
