GKE pod not able to access private GCS Bucket (401 unauthorized error)
I am developing a Python application that will be deployed to a Google Kubernetes Engine (GKE) pod. The application reads and writes .csv files in a private Google Cloud Storage (GCS) bucket. Reads and writes to the private bucket work when I run the application on my local system.
They fail once the application is deployed to the GKE pod.
The pod in the cluster cannot access the private GCS bucket even though I provide the same credentials as on my local system. Here are the Dockerfile, the relevant entries from requirements.txt, and the error I am facing:
FROM python:3.9.10-slim-buster
WORKDIR /pipeline
COPY . .
RUN pip3 install --no-cache-dir -r requirements.txt
EXPOSE 3000
ENV GOOGLE_APPLICATION_CREDENTIALS=/pipeline/cred.json
ENV GIT_PYTHON_REFRESH=quiet
google-api-core==2.8.2
google-auth==2.9.0
google-auth-oauthlib==0.5.2
google-cloud-bigquery==3.2.0
google-cloud-bigquery-storage==2.11.0
google-cloud-core==2.3.1
google-cloud-storage==2.4.0
google-crc32c==1.3.0
google-resumable-media==2.3.3
googleapis-common-protos==1.56.3
fsspec==2022.8.2
gcsfs==2022.8.2
gevent==21.12.0
Anonymous caller does not have storage.objects.create access to the Google Cloud Storage bucket., 401
ERROR:gcsfs:_request non-retriable exception: Anonymous caller does not have storage.objects.create access to the Google Cloud Storage bucket., 401
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/gcsfs/retry.py", line 115, in retry_request
return await func(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/gcsfs/core.py", line 384, in _request
validate_response(status, contents, path, args)
File "/usr/local/lib/python3.9/site-packages/gcsfs/retry.py", line 102, in validate_response
raise HttpError(error)
gcsfs.retry.HttpError: Anonymous caller does not have storage.objects.create access to the Google Cloud Storage bucket., 401
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/threading.py", line 973, in _bootstrap_inner
self.run()
File "/usr/local/lib/python3.9/threading.py", line 910, in run
self._target(*self._args, **self._kwargs)
File "/pipeline/training/train.py", line 133, in training
X.to_csv(file_name, index=False)
File "/usr/local/lib/python3.9/site-packages/pandas/core/generic.py", line 3563, in to_csv
return DataFrameRenderer(formatter).to_csv(
File "/usr/local/lib/python3.9/site-packages/pandas/io/formats/format.py", line 1180, in to_csv
csv_formatter.save()
File "/usr/local/lib/python3.9/site-packages/pandas/io/formats/csvs.py", line 261, in save
self._save()
File "/usr/local/lib/python3.9/site-packages/pandas/io/formats/csvs.py", line 266, in _save
self._save_body()
File "/usr/local/lib/python3.9/site-packages/pandas/io/formats/csvs.py", line 304, in _save_body
self._save_chunk(start_i, end_i)
File "/usr/local/lib/python3.9/site-packages/pandas/io/formats/csvs.py", line 315, in _save_chunk
libwriters.write_csv_rows(
File "pandas/_libs/writers.pyx", line 72, in pandas._libs.writers.write_csv_rows
File "/usr/local/lib/python3.9/site-packages/fsspec/spec.py", line 1491, in write
self.flush()
File "/usr/local/lib/python3.9/site-packages/fsspec/spec.py", line 1527, in flush
self._initiate_upload()
File "/usr/local/lib/python3.9/site-packages/gcsfs/core.py", line 1443, in _initiate_upload
self.location = sync(
File "/usr/local/lib/python3.9/site-packages/fsspec/asyn.py", line 96, in sync
raise return_result
File "/usr/local/lib/python3.9/site-packages/fsspec/asyn.py", line 53, in _runner
result[0] = await coro
File "/usr/local/lib/python3.9/site-packages/gcsfs/core.py", line 1559, in initiate_upload
headers, _ = await fs._call(
File "/usr/local/lib/python3.9/site-packages/gcsfs/core.py", line 392, in _call
status, headers, info, contents = await self._request(
File "/usr/local/lib/python3.9/site-packages/decorator.py", line 221, in fun
return await caller(func, *(extras + args), **kw)
File "/usr/local/lib/python3.9/site-packages/gcsfs/retry.py", line 152, in retry_request
raise e
File "/usr/local/lib/python3.9/site-packages/gcsfs/retry.py", line 115, in retry_request
return await func(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/gcsfs/core.py", line 384, in _request
validate_response(status, contents, path, args)
File "/usr/local/lib/python3.9/site-packages/gcsfs/retry.py", line 102, in validate_response
raise HttpError(error)
gcsfs.retry.HttpError: Anonymous caller does not have storage.objects.create access to the Google Cloud Storage bucket., 401
I also tried running the application after making the bucket public. With that change, reads and writes to the bucket work.
The problem occurs only when the bucket is private, which is essential for deploying the application.
Any help resolving this error would be appreciated. Thanks in advance!
You can probably read and write from your local system because you are using your own credentials, or impersonating a service account (SA), with permission to access the private bucket. FYI: if you are accessing the bucket across projects, the SA must be granted the required permissions in the project that contains the bucket.
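The "Anonymous caller" wording in the error suggests the request reached GCS without any credentials attached. One possible cause, and this is only an assumption, is that the file named by GOOGLE_APPLICATION_CREDENTIALS is missing or unreadable inside the container (for example, if it was excluded from the image build). The following is a minimal sketch of a check you could run inside the pod; the function name `check_credentials_file` is hypothetical, and it uses only the standard library, so it sends nothing over the network.

```python
import json
import os


def check_credentials_file(env=None):
    """Report whether GOOGLE_APPLICATION_CREDENTIALS points at a usable key file.

    Returns an (ok, message) tuple. Only the local file is inspected.
    """
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return False, "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(path):
        return False, f"{path} does not exist inside the container"
    try:
        with open(path, encoding="utf-8") as fh:
            info = json.load(fh)
    except (OSError, ValueError) as exc:
        # ValueError covers malformed JSON and undecodable bytes.
        return False, f"{path} could not be read as JSON: {exc}"
    if not isinstance(info, dict):
        return False, f"{path} does not contain a JSON object"
    # Service account key files carry "type" and "client_email" fields.
    email = info.get("client_email", "<no client_email field>")
    kind = info.get("type", "unknown")
    return True, f"{path} looks like a {kind} key for {email}"


if __name__ == "__main__":
    ok, message = check_credentials_file()
    print("OK" if ok else "PROBLEM", "-", message)
```

If this reports a problem inside the pod but not locally, the image and the local environment are not using the same credentials after all.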
One thing you can do is grant the required permissions to the SA that runs the GKE pod, instead of explicitly setting credentials through GOOGLE_APPLICATION_CREDENTIALS. Your code can then obtain the credentials with google.auth.default() wherever they are needed.
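A minimal sketch of that suggestion, under some assumptions: `resolve_default_credentials` and `write_csv_to_gcs` are hypothetical helper names, the path passed in is a `gs://` URL, and pandas forwards `storage_options` to gcsfs. As far as I understand gcsfs, passing `token="google_default"` asks it to use Application Default Credentials only, so a missing credential should raise an error rather than silently falling back to anonymous access.

```python
def resolve_default_credentials():
    """Return (credentials, project_id) found by Application Default Credentials."""
    # Imported lazily so this module loads even where google-auth is absent.
    import google.auth

    # Raises google.auth.exceptions.DefaultCredentialsError if nothing is found.
    return google.auth.default()


def write_csv_to_gcs(frame, gcs_path, token="google_default"):
    """Write a DataFrame to a gs:// path with an explicit gcsfs token mode.

    token="google_default" uses Application Default Credentials;
    token="cloud" uses the metadata server of the node or pod.
    """
    frame.to_csv(gcs_path, index=False, storage_options={"token": token})
```

In the traceback above, this would stand in for the `X.to_csv(file_name, index=False)` call in train.py, for example `write_csv_to_gcs(X, file_name)`.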
PS: If the SA running your GKE pod has storage access permissions in the project that contains the bucket you are trying to access, you should be fine.
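To find out which SA the pod actually runs as, one option is to ask the metadata server from inside the pod. This is a rough sketch, not a definitive tool: the helper names `build_metadata_request` and `pod_service_account` are hypothetical, and it assumes the metadata server is reachable from the pod. Whatever identity it reports is the one that needs a storage role on the bucket (for writes, a role that includes `storage.objects.create`, such as `roles/storage.objectCreator` or `roles/storage.objectAdmin`).

```python
import urllib.error
import urllib.request

# Metadata endpoint that returns the email of the default service account.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/email"
)


def build_metadata_request(url=METADATA_URL):
    """Build the request; the metadata server requires the Metadata-Flavor header."""
    return urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})


def pod_service_account(timeout=2.0):
    """Return the service account identity of this pod, or None if unreachable."""
    try:
        with urllib.request.urlopen(build_metadata_request(), timeout=timeout) as resp:
            return resp.read().decode("utf-8").strip()
    except (urllib.error.URLError, OSError):
        # Not running on Google Cloud, or the metadata server is blocked.
        return None


if __name__ == "__main__":
    print("Pod runs as:", pod_service_account())
```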
Hope this helps :)