
Docker build triggered by Google Cloud Build for a Python app fails to pull a pip requirement from a private Artifact Registry repository

I have a Google Cloud Build for a Python Flask app that is triggered on pull requests to my git repo. I'm trying to add a private Python package dependency that is stored in Google Artifact Registry. This works fine locally when I copy a service account JSON key into the Docker container and point GOOGLE_APPLICATION_CREDENTIALS to it, but I don't want to commit the service account key to GitHub, and I would like to avoid having the key in the container at all.

This is similar to this question, but that one is unanswered apart from a suggestion to use a short-lived access token, with no details or documentation on how to integrate that into an automated Cloud Build trigger.
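For reference, one shape the short-lived token suggestion could take is an index URL with the token embedded as basic-auth credentials. This is only a sketch under assumptions: it assumes Artifact Registry accepts the username oauth2accesstoken with an access token as the password, and authenticated_index_url is a hypothetical helper name, not part of pip or any Google library. A token passed into a docker build this way would still be visible in the build history, so it does not remove the exposure concern by itself.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def authenticated_index_url(index_url: str, access_token: str) -> str:
    """Return index_url with the token embedded as basic-auth credentials.

    Assumption: the registry accepts the fixed username "oauth2accesstoken"
    together with a short-lived OAuth access token as the password.
    """
    parts = urlsplit(index_url)
    # Percent-encode the token so characters such as "/" cannot break the URL.
    netloc = f"oauth2accesstoken:{quote(access_token, safe='')}@{parts.hostname}"
    if parts.port:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    # "FAKE_TOKEN" is a placeholder; a real token would come from the build environment.
    print(authenticated_index_url(
        "https://europe-west1-python.pkg.dev/hive-347910/hive-commons-art-repo/simple",
        "FAKE_TOKEN",
    ))
```

The resulting URL could be given to pip through --index-url instead of relying on the keyring package.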

My Dockerfile looks like this:

WORKDIR $APP_HOME
COPY . ./

# Let Hive know what env it is running in
ARG HIVE_BUILD_ENV=UNSET
ENV HIVE_ENV=$HIVE_BUILD_ENV

# setup timezone for Ireland and magics library for file type detection
ENV DEBIAN_FRONTEND=noninteractive
ENV TZ="Europe/Dublin"
RUN apt-get update && apt-get install -y tzdata libmagic1


FROM base as staging

ENV DB_NAME=hive-staging

RUN pip install --upgrade pip
RUN pip install keyrings.google-artifactregistry-auth 
RUN pip install --no-cache-dir -r requirements.txt  # <<< build fails here

CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 --timeout 0 "hive:create_app()"

The start of my cloudbuild-preview.yaml looks like this (I can provide the whole file, but the build fails on this step):

steps:
  - id: "build image"
    name: "gcr.io/cloud-builders/docker"
    args:
      [
        "build",
        "-t",
        "gcr.io/${PROJECT_ID}/${_SERVICE_NAME}:${_PR_NUMBER}-${SHORT_SHA}",
        ".",
        "--target",
        "staging",
        "--no-cache"
      ]

The start of my requirements.txt looks like this:

--index-url https://europe-west1-python.pkg.dev/hive-347910/hive-commons-art-repo/simple
--extra-index-url https://pypi.org/simple
hive-commons==0.0.6
Flask==2.2.2

When I run the build I get the following error output:

Step #0 - "build image": Step 15/16 : RUN pip install --no-cache-dir -r requirements.txt
Step #0 - "build image":  ---> Running in 1d17c8df7022
Step #0 - "build image": Looking in indexes: https://europe-west1-python.pkg.dev/hive-347910/hive-commons-art-repo/simple, https://pypi.org/simple
Step #0 - "build image": WARNING: Compute Engine Metadata server unavailable on attempt 1 of 3. Reason: timed out
Step #0 - "build image": WARNING: Compute Engine Metadata server unavailable on attempt 2 of 3. Reason: timed out
Step #0 - "build image": WARNING: Compute Engine Metadata server unavailable on attempt 3 of 3. Reason: timed out
Step #0 - "build image": WARNING: Authentication failed using Compute Engine authentication due to unavailable metadata server.
Step #0 - "build image": WARNING: Failed to retrieve Application Default Credentials: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
Step #0 - "build image": WARNING: Trying to retrieve credentials from gcloud...
Step #0 - "build image": WARNING: Failed to retrieve credentials from gcloud: gcloud command exited with status: [Errno 2] No such file or directory: 'gcloud'
Step #0 - "build image": WARNING: Artifact Registry PyPI Keyring: No credentials could be found.
Step #0 - "build image": WARNING: Keyring is skipped due to an exception: Failed to find credentials, Please run: `gcloud auth application-default login or export GOOGLE_APPLICATION_CREDENTIALS=<path/to/service/account/key>`
Step #0 - "build image": User for europe-west1-python.pkg.dev: ERROR: Exception:
Step #0 - "build image": Traceback (most recent call last):
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/cli/base_command.py", line 160, in exc_logging_wrapper
Step #0 - "build image":     status = run_func(*args)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/cli/req_command.py", line 247, in wrapper
Step #0 - "build image":     return func(self, options, args)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/commands/install.py", line 400, in run
Step #0 - "build image":     requirement_set = resolver.resolve(
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 92, in resolve
Step #0 - "build image":     result = self._result = resolver.resolve(
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_vendor/resolvelib/resolvers.py", line 481, in resolve
Step #0 - "build image":     state = resolution.resolve(requirements, max_rounds=max_rounds)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_vendor/resolvelib/resolvers.py", line 348, in resolve
Step #0 - "build image":     self._add_to_criteria(self.state.criteria, r, parent=None)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_vendor/resolvelib/resolvers.py", line 172, in _add_to_criteria
Step #0 - "build image":     if not criterion.candidates:
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_vendor/resolvelib/structs.py", line 151, in __bool__
Step #0 - "build image":     return bool(self._sequence)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 155, in __bool__
Step #0 - "build image":     return any(self)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 143, in <genexpr>
Step #0 - "build image":     return (c for c in iterator if id(c) not in self._incompatible_ids)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 44, in _iter_built
Step #0 - "build image":     for version, func in infos:
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 279, in iter_index_candidate_infos
Step #0 - "build image":     result = self._finder.find_best_candidate(
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/index/package_finder.py", line 889, in find_best_candidate
Step #0 - "build image":     candidates = self.find_all_candidates(project_name)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/index/package_finder.py", line 830, in find_all_candidates
Step #0 - "build image":     page_candidates = list(page_candidates_it)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/index/sources.py", line 134, in page_candidates
Step #0 - "build image":     yield from self._candidates_from_page(self._link)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/index/package_finder.py", line 790, in process_project_url
Step #0 - "build image":     index_response = self._link_collector.fetch_response(project_url)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/index/collector.py", line 461, in fetch_response
Step #0 - "build image":     return _get_index_content(location, session=self.session)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/index/collector.py", line 364, in _get_index_content
Step #0 - "build image":     resp = _get_simple_response(url, session=session)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/index/collector.py", line 135, in _get_simple_response
Step #0 - "build image":     resp = session.get(
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_vendor/requests/sessions.py", line 600, in get
Step #0 - "build image":     return self.request("GET", url, **kwargs)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/network/session.py", line 518, in request
Step #0 - "build image":     return super().request(method, url, *args, **kwargs)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_vendor/requests/sessions.py", line 587, in request
Step #0 - "build image":     resp = self.send(prep, **send_kwargs)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_vendor/requests/sessions.py", line 708, in send
Step #0 - "build image":     r = dispatch_hook("response", hooks, r, **kwargs)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_vendor/requests/hooks.py", line 30, in dispatch_hook
Step #0 - "build image":     _hook_data = hook(hook_data, **kwargs)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/network/auth.py", line 270, in handle_401
Step #0 - "build image":     username, password, save = self._prompt_for_password(parsed.netloc)
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/network/auth.py", line 233, in _prompt_for_password
Step #0 - "build image":     username = ask_input(f"User for {netloc}: ")
Step #0 - "build image":   File "/usr/local/lib/python3.8/site-packages/pip/_internal/utils/misc.py", line 204, in ask_input
Step #0 - "build image":     return input(message)
Step #0 - "build image": EOFError: EOF when reading a line
Step #0 - "build image": The command '/bin/sh -c pip install --no-cache-dir -r requirements.txt' returned a non-zero code: 2
Finished Step #0 - "build image"
ERROR
ERROR: build step 0 "gcr.io/cloud-builders/docker" failed: step exited with non-zero status: 2
Step #0 - "build image": 

These two lines are the root of my problem:

WARNING: Authentication failed using Compute Engine authentication due to unavailable metadata server.
WARNING: Failed to retrieve Application Default Credentials: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started

Looking through the docs linked in the error message, however, I can't see a way to authenticate pip running inside a Docker build without the service account JSON being present. I've seen other questions about this problem where people suggested multi-stage builds. That would work if I were running the builds from my dev machine, but I face the same problem: where does the first stage of a multi-stage build get the service account key from when it only pulls from source as part of an automated Cloud Build?

Any advice on how to approach this problem would be appreciated.

EDIT: After contacting Google support I was shown a way to perform this step without using Secret Manager or a service account JSON key.

This page in the docs shows that if you add "--network=cloudbuild" to the docker build step in your yaml file, the build's containers are attached to the local Docker network named cloudbuild, on which the Application Default Credentials (ADC) that pip needs to access Artifact Registry are available. This solution is very simple and removes any need for a service account credentials file: it is more secure and leaves a less complex Dockerfile. My yaml file now looks like this:

steps:
  - id: "build image"
    name: "gcr.io/cloud-builders/docker"
    args:
      [
        "build",
        "-t",
        "gcr.io/${PROJECT_ID}/${_SERVICE_NAME}:${_PR_NUMBER}-${SHORT_SHA}",
        ".",
        "--target",
        "staging",
        "--network=cloudbuild"
      ]
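One way to check from inside the build that credentials really are reachable is to ask the metadata server for a token directly, for example from a temporary RUN step. The sketch below assumes that the standard Compute Engine metadata endpoint and the Metadata-Flavor header are what the cloudbuild network exposes; fetch_access_token is a hypothetical helper name, and the urlopen parameter exists only so the function can be exercised without network access.

```python
import json
import urllib.request

# Assumed endpoint: the standard metadata path for the default service account token.
TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def fetch_access_token(timeout=2.0, urlopen=urllib.request.urlopen):
    """Return an access token from the metadata server, or None if unavailable."""
    request = urllib.request.Request(TOKEN_URL, headers={"Metadata-Flavor": "Google"})
    try:
        with urlopen(request, timeout=timeout) as response:
            payload = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers DNS failures, timeouts and HTTP errors;
        # ValueError covers a response body that is not valid JSON.
        return None
    if not isinstance(payload, dict):
        return None
    return payload.get("access_token")


if __name__ == "__main__":
    token = fetch_access_token()
    print("metadata server reachable" if token else "no credentials from metadata server")
```

Without --network=cloudbuild this would be expected to report that no credentials are available, matching the "Compute Engine Metadata server unavailable" warnings in the failing build log.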

Original solution: I found a solution that doesn't require committing the service account JSON to source control.

Cloud Build can access secrets stored in Secret Manager.

I pass the service account JSON into the docker build command as a --build-arg, save it to a file, and point GOOGLE_APPLICATION_CREDENTIALS to that location. I delete the file afterwards and unset the env variable.

Here are the relevant parts of my yaml file:

  - id: "build image"
    name: "gcr.io/cloud-builders/docker"
    entrypoint: 'bash'
    args:
      [
        "-c",
        'docker build --tag gcr.io/${PROJECT_ID}/${_SERVICE_NAME}:${_PR_NUMBER}-${SHORT_SHA} --target staging --build-arg CREDS_JSON="$$CREDS_JSON" --no-cache .'
      ]
    secretEnv: ['CREDS_JSON']
availableSecrets:
  secretManager:
    - versionName: projects/$PROJECT_ID/secrets/fake_secret_name/versions/latest
      env: 'CREDS_JSON'

And the parts of the Dockerfile that handle this:

ARG CREDS_JSON

WORKDIR $APP_HOME
RUN touch creds.json
RUN bash -c 'echo -E "$CREDS_JSON" >> ./creds.json'
ARG GOOGLE_APPLICATION_CREDENTIALS="$APP_HOME/creds.json"

RUN pip install --upgrade pip
RUN pip install keyrings.google-artifactregistry-auth
RUN pip install --no-cache-dir -r requirements.txt

RUN rm $APP_HOME/creds.json

In particular, note the double quotes around the --build-arg value, and use RUN bash -c 'echo -E "$CREDS_JSON" >>./creds.json' rather than a plain RUN echo to preserve the white space and carriage returns in the JSON. Otherwise the keyring package can't parse the credentials file as valid JSON.
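A plausible mechanism for this, stated as an assumption rather than something confirmed above: the private_key field of a service account key contains \n escape sequences, and an echo that interprets backslash escapes turns them into real newlines inside a JSON string, which is not valid JSON. The sketch below reproduces that effect with made-up key material; is_valid_json is a hypothetical helper.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Made-up key material. The doubled backslashes put a literal backslash
# followed by "n" into the string, as in a real key file.
intact = '{"private_key": "-----BEGIN PRIVATE KEY-----\\nFAKE\\n-----END PRIVATE KEY-----\\n"}'

# Simulate an echo that expands the two-character escape into a real newline.
mangled = intact.replace("\\n", "\n")

if __name__ == "__main__":
    print("intact parses: ", is_valid_json(intact))
    print("mangled parses:", is_valid_json(mangled))
```

Running a check like this against the file written during the build is a quick way to tell whether the credentials survived the trip through --build-arg and echo.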

This solves my problem of accessing a pip requirement from the Artifact Registry repository without storing credentials in source control. However, the service account JSON is still exposed in the history of the Docker image build stages, so I'm not 100% happy with this solution, and I'm looking at using a multi-stage build to restrict that exposure further. I'll also be rotating these credentials on a short cycle.

Ideally I'd like the docker build to use the credentials of the Cloud Build service account that runs the build, and I've contacted Google Cloud support to see if that's possible.
