GKE pod not able to access private GCS Bucket (401 unauthorized error)

I am developing a Python application that will be deployed in a Google Kubernetes Engine (GKE) pod. The application reads and writes .csv files in Google Cloud Storage (a private bucket). I get an error when reading or writing files in the bucket. Reads and writes to the private bucket work when I run the application on my local system.

The operation is failing when the application is deployed to the GKE pod.

The GKE pod in the cluster cannot access the private GCS bucket even though I provide the same credentials as on my local system. Here are some details about the application and the error:

  • Dockerfile: The Dockerfile references the cred.json file, which contains the credentials of the Google Cloud service account. The service account has Google Cloud Storage admin permissions.
FROM python:3.9.10-slim-buster
WORKDIR /pipeline
COPY . .
RUN pip3 install --no-cache-dir -r requirements.txt

EXPOSE 3000

ENV GOOGLE_APPLICATION_CREDENTIALS=/pipeline/cred.json
ENV GIT_PYTHON_REFRESH=quiet
  • requirements.txt: The contents of requirements.txt are below (I have included only the Google Cloud related packages, as they are the ones relevant to the error):
google-api-core==2.8.2
google-auth==2.9.0
google-auth-oauthlib==0.5.2
google-cloud-bigquery==3.2.0
google-cloud-bigquery-storage==2.11.0
google-cloud-core==2.3.1
google-cloud-storage==2.4.0
google-crc32c==1.3.0
google-resumable-media==2.3.3
googleapis-common-protos==1.56.3
fsspec==2022.8.2
gcsfs==2022.8.2
gevent==21.12.0
  • Error details: Following is the traceback:
Anonymous caller does not have storage.objects.create access to the Google Cloud Storage bucket., 401
ERROR:gcsfs:_request non-retriable exception: Anonymous caller does not have storage.objects.create access to the Google Cloud Storage bucket., 401
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/gcsfs/retry.py", line 115, in retry_request
    return await func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/gcsfs/core.py", line 384, in _request
    validate_response(status, contents, path, args)
  File "/usr/local/lib/python3.9/site-packages/gcsfs/retry.py", line 102, in validate_response
    raise HttpError(error)
gcsfs.retry.HttpError: Anonymous caller does not have storage.objects.create access to the Google Cloud Storage bucket., 401
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.9/threading.py", line 910, in run
    self._target(*self._args, **self._kwargs)
  File "/pipeline/training/train.py", line 133, in training
    X.to_csv(file_name, index=False)
  File "/usr/local/lib/python3.9/site-packages/pandas/core/generic.py", line 3563, in to_csv
    return DataFrameRenderer(formatter).to_csv(
  File "/usr/local/lib/python3.9/site-packages/pandas/io/formats/format.py", line 1180, in to_csv
    csv_formatter.save()
  File "/usr/local/lib/python3.9/site-packages/pandas/io/formats/csvs.py", line 261, in save
    self._save()
  File "/usr/local/lib/python3.9/site-packages/pandas/io/formats/csvs.py", line 266, in _save
    self._save_body()
  File "/usr/local/lib/python3.9/site-packages/pandas/io/formats/csvs.py", line 304, in _save_body
    self._save_chunk(start_i, end_i)
  File "/usr/local/lib/python3.9/site-packages/pandas/io/formats/csvs.py", line 315, in _save_chunk
    libwriters.write_csv_rows(
  File "pandas/_libs/writers.pyx", line 72, in pandas._libs.writers.write_csv_rows
  File "/usr/local/lib/python3.9/site-packages/fsspec/spec.py", line 1491, in write
    self.flush()
  File "/usr/local/lib/python3.9/site-packages/fsspec/spec.py", line 1527, in flush
    self._initiate_upload()
  File "/usr/local/lib/python3.9/site-packages/gcsfs/core.py", line 1443, in _initiate_upload
    self.location = sync(
  File "/usr/local/lib/python3.9/site-packages/fsspec/asyn.py", line 96, in sync
    raise return_result
  File "/usr/local/lib/python3.9/site-packages/fsspec/asyn.py", line 53, in _runner
    result[0] = await coro
  File "/usr/local/lib/python3.9/site-packages/gcsfs/core.py", line 1559, in initiate_upload
    headers, _ = await fs._call(
  File "/usr/local/lib/python3.9/site-packages/gcsfs/core.py", line 392, in _call
    status, headers, info, contents = await self._request(
  File "/usr/local/lib/python3.9/site-packages/decorator.py", line 221, in fun
    return await caller(func, *(extras + args), **kw)
  File "/usr/local/lib/python3.9/site-packages/gcsfs/retry.py", line 152, in retry_request
    raise e
  File "/usr/local/lib/python3.9/site-packages/gcsfs/retry.py", line 115, in retry_request
    return await func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/gcsfs/core.py", line 384, in _request
    validate_response(status, contents, path, args)
  File "/usr/local/lib/python3.9/site-packages/gcsfs/retry.py", line 102, in validate_response
    raise HttpError(error)
gcsfs.retry.HttpError: Anonymous caller does not have storage.objects.create access to the Google Cloud Storage bucket., 401

I also tried running the application after making the Google Cloud bucket public. With this approach, the read and write operations on the bucket work.

The problem arises only when the bucket is private (which is essential for deploying the application).

Any help to resolve this error will be appreciated. Thanks in advance!!

You are able to read/write from your local system probably because you are using your own credentials or impersonating the SA (service account) that has permission to access the private bucket. FYI: if you are accessing the bucket cross-project, the SA must be granted the required permissions in the project that the bucket is in.
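
To compare the identity used locally with the one inside the pod, one option is to inspect the key file that GOOGLE_APPLICATION_CREDENTIALS names, from inside the container. This is only a diagnostic sketch using the standard library: the function name describe_key_file is hypothetical, and it assumes the file is a JSON service account key with type and client_email fields. It does not contact Google or validate the key.

```python
import json
import os


def describe_key_file(env=None):
    """Summarize what GOOGLE_APPLICATION_CREDENTIALS points at (no network calls)."""
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(path):
        return f"{path} does not exist inside this container"
    try:
        with open(path, encoding="utf-8") as fh:
            info = json.load(fh)
    except (OSError, ValueError) as exc:
        # ValueError covers malformed JSON and undecodable bytes.
        return f"{path} could not be read as JSON: {exc}"
    if not isinstance(info, dict):
        return f"{path} is JSON but not a key object"
    # Assumption: service account keys carry "type" and "client_email".
    kind = info.get("type", "unknown")
    email = info.get("client_email", "n/a")
    return f"{path}: type={kind}, client_email={email}"


if __name__ == "__main__":
    print(describe_key_file())
```

Running this both locally and in the pod (for example through kubectl exec) shows whether the two environments see the same file and the same service account email.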

One thing you can do is grant the required permissions to the SA that runs the GKE pod (instead of explicitly setting credentials through GOOGLE_APPLICATION_CREDENTIALS); you can then obtain the credentials with google.auth.default() wherever needed.
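
A minimal sketch of that idea, assuming the pod's SA already has access to the bucket: fetch the credentials with google.auth.default() and hand them to pandas through storage_options, which pandas forwards to gcsfs. The helper name adc_storage_options and the default_fn parameter are hypothetical; default_fn exists only so the helper can be exercised without the google-auth package or real credentials.

```python
# Read/write scope for Cloud Storage objects.
GCS_READ_WRITE_SCOPE = "https://www.googleapis.com/auth/devstorage.read_write"


def adc_storage_options(default_fn=None):
    """Build storage_options for pandas/gcsfs from Application Default Credentials."""
    if default_fn is None:
        # Imported lazily so the module loads even where google-auth is absent.
        import google.auth

        default_fn = google.auth.default
    # google.auth.default() returns a (credentials, project_id) tuple.
    credentials, project_id = default_fn(scopes=[GCS_READ_WRITE_SCOPE])
    options = {"token": credentials}
    if project_id is not None:
        # project_id can be None when the environment does not define one.
        options["project"] = project_id
    return options
```

In the application this could be used as X.to_csv(file_name, index=False, storage_options=adc_storage_options()), where file_name is a gs:// path. Passing the credentials explicitly avoids relying on gcsfs's own credential lookup, which is where the "Anonymous caller" message in the traceback originates.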

PS: If the SA running your GKE pod has storage access permission in the project that contains the bucket you are trying to access, you should be fine.

Hope this helps:)

