I am developing a Python application that will be deployed in a Google Kubernetes Engine (GKE) pod. The application reads and writes .csv files in Google Cloud Storage (a private bucket). I am facing an error when trying to read/write files in the bucket. Reading from and writing to the private bucket works when I run the application on my local system.
The operation fails when the application is deployed to the GKE pod.
The GKE pod in the cluster cannot access the private GCS bucket even though I provide the same credentials as on my local system. Here are some details about the application and the error I am facing:
Dockerfile:
FROM python:3.9.10-slim-buster
WORKDIR /pipeline
COPY . .
RUN pip3 install --no-cache-dir -r requirements.txt
EXPOSE 3000
ENV GOOGLE_APPLICATION_CREDENTIALS=/pipeline/cred.json
ENV GIT_PYTHON_REFRESH=quiet
requirements.txt:
google-api-core==2.8.2
google-auth==2.9.0
google-auth-oauthlib==0.5.2
google-cloud-bigquery==3.2.0
google-cloud-bigquery-storage==2.11.0
google-cloud-core==2.3.1
google-cloud-storage==2.4.0
google-crc32c==1.3.0
google-resumable-media==2.3.3
googleapis-common-protos==1.56.3
fsspec==2022.8.2
gcsfs==2022.8.2
gevent==21.12.0
Error:
Anonymous caller does not have storage.objects.create access to the Google Cloud Storage bucket., 401
ERROR:gcsfs:_request non-retriable exception: Anonymous caller does not have storage.objects.create access to the Google Cloud Storage bucket., 401
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/gcsfs/retry.py", line 115, in retry_request
return await func(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/gcsfs/core.py", line 384, in _request
validate_response(status, contents, path, args)
File "/usr/local/lib/python3.9/site-packages/gcsfs/retry.py", line 102, in validate_response
raise HttpError(error)
gcsfs.retry.HttpError: Anonymous caller does not have storage.objects.create access to the Google Cloud Storage bucket., 401
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/threading.py", line 973, in _bootstrap_inner
self.run()
File "/usr/local/lib/python3.9/threading.py", line 910, in run
self._target(*self._args, **self._kwargs)
File "/pipeline/training/train.py", line 133, in training
X.to_csv(file_name, index=False)
File "/usr/local/lib/python3.9/site-packages/pandas/core/generic.py", line 3563, in to_csv
return DataFrameRenderer(formatter).to_csv(
File "/usr/local/lib/python3.9/site-packages/pandas/io/formats/format.py", line 1180, in to_csv
csv_formatter.save()
File "/usr/local/lib/python3.9/site-packages/pandas/io/formats/csvs.py", line 261, in save
self._save()
File "/usr/local/lib/python3.9/site-packages/pandas/io/formats/csvs.py", line 266, in _save
self._save_body()
File "/usr/local/lib/python3.9/site-packages/pandas/io/formats/csvs.py", line 304, in _save_body
self._save_chunk(start_i, end_i)
File "/usr/local/lib/python3.9/site-packages/pandas/io/formats/csvs.py", line 315, in _save_chunk
libwriters.write_csv_rows(
File "pandas/_libs/writers.pyx", line 72, in pandas._libs.writers.write_csv_rows
File "/usr/local/lib/python3.9/site-packages/fsspec/spec.py", line 1491, in write
self.flush()
File "/usr/local/lib/python3.9/site-packages/fsspec/spec.py", line 1527, in flush
self._initiate_upload()
File "/usr/local/lib/python3.9/site-packages/gcsfs/core.py", line 1443, in _initiate_upload
self.location = sync(
File "/usr/local/lib/python3.9/site-packages/fsspec/asyn.py", line 96, in sync
raise return_result
File "/usr/local/lib/python3.9/site-packages/fsspec/asyn.py", line 53, in _runner
result[0] = await coro
File "/usr/local/lib/python3.9/site-packages/gcsfs/core.py", line 1559, in initiate_upload
headers, _ = await fs._call(
File "/usr/local/lib/python3.9/site-packages/gcsfs/core.py", line 392, in _call
status, headers, info, contents = await self._request(
File "/usr/local/lib/python3.9/site-packages/decorator.py", line 221, in fun
return await caller(func, *(extras + args), **kw)
File "/usr/local/lib/python3.9/site-packages/gcsfs/retry.py", line 152, in retry_request
raise e
File "/usr/local/lib/python3.9/site-packages/gcsfs/retry.py", line 115, in retry_request
return await func(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/gcsfs/core.py", line 384, in _request
validate_response(status, contents, path, args)
File "/usr/local/lib/python3.9/site-packages/gcsfs/retry.py", line 102, in validate_response
raise HttpError(error)
gcsfs.retry.HttpError: Anonymous caller does not have storage.objects.create access to the Google Cloud Storage bucket., 401
I also tried running the application with the Google Cloud Storage bucket made public. With that approach, reads and writes to the bucket work.
The problem arises only when the bucket is private, which is essential for deploying the application.
Any help to resolve this error will be appreciated. Thanks in advance!!
You are able to read/write from your local system because you are probably using your own credentials or impersonating a service account (SA) that has permission to access the private bucket. FYI: if you are accessing the bucket cross-project, the SA must be granted the required permissions in the project the bucket is in.
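One thing worth ruling out inside the pod first: an "Anonymous caller" message typically means the client found no usable credentials and fell back to unauthenticated requests. Here is a minimal sketch, assuming the GOOGLE_APPLICATION_CREDENTIALS variable from the Dockerfile in the question is still in effect. The helper name check_credentials_file is hypothetical; it uses only the standard library and does not contact Google.

```python
import json
import os


def check_credentials_file(env=None):
    """Return a short status string for GOOGLE_APPLICATION_CREDENTIALS.

    Hypothetical diagnostic helper: it only inspects the local file.
    """
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "unset"
    if not os.path.isfile(path):
        # e.g. the key file was excluded from the image build context
        return "file not found"
    try:
        with open(path, encoding="utf-8") as fh:
            info = json.load(fh)
    except (OSError, ValueError):
        return "unreadable or not JSON"
    if not isinstance(info, dict):
        return "unreadable or not JSON"
    # Service account key files normally carry a "client_email" field.
    return "ok: " + str(info.get("client_email", "unknown identity"))


# Run this inside the pod (for example at application startup).
print(check_credentials_file())
```

If this reports anything other than "ok: ..." in the pod while reporting "ok: ..." locally, the key file is probably not reaching the container.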
One thing you can do is grant the required permissions to the SA that runs the GKE pod (instead of explicitly setting the credentials via GOOGLE_APPLICATION_CREDENTIALS) and then obtain the credentials with google.auth.default() wherever needed.
PS: If the SA running your GKE pod has storage access permission in the project that contains the bucket you are trying to access, you should be just fine.
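The suggestion above could be sketched roughly as follows. This is an illustration under assumptions, not a verified fix: it assumes the pod's SA already has the needed role on the bucket, that GOOGLE_APPLICATION_CREDENTIALS is not pointing at a file missing from the image, and that the installed gcsfs accepts a credentials object as its token option. The function names and the bucket path are placeholders.

```python
def default_storage_options():
    """Build pandas storage_options from Application Default Credentials."""
    import google.auth  # imported lazily; provided by the google-auth package

    # On GKE this is expected to resolve to the identity the pod runs as.
    credentials, _project = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    # Assumption: gcsfs accepts a credentials object under the "token" key.
    return {"token": credentials}


def write_csv(df, gcs_path, storage_options=None):
    """Write a DataFrame to a gs:// path, passing credentials explicitly."""
    if storage_options is None:
        storage_options = default_storage_options()
    df.to_csv(gcs_path, index=False, storage_options=storage_options)


# Usage inside the pod (placeholder bucket and object names):
# write_csv(X, "gs://my-private-bucket/output/train.csv")
```

Passing the credentials explicitly makes it obvious which identity the write uses, instead of relying on gcsfs to discover credentials on its own.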
Hope this helps:)