Building a container from Windows (local) or Linux (AWS EC2) has different effects

I have been playing around with AWS Batch, and I am having some trouble understanding why everything works when I build a Docker image on my local Windows machine and push it to ECR, while it doesn't work when I do the same from an Ubuntu EC2 instance. What I show below is adapted from this tutorial.

The Dockerfile is very simple:

FROM python:3.6.10-alpine
RUN apk add --no-cache --upgrade bash
COPY ./ /usr/local/aws_batch_tutorial
RUN pip3 install -r /usr/local/aws_batch_tutorial/requirements.txt
WORKDIR /usr/local/aws_batch_tutorial

The local folder contains the following bash script ( run_job.sh ):

#!/bin/bash

# Script name, used in the error messages below
BASENAME="${0##*/}"

error_exit () {
  echo "${BASENAME} - ${1}" >&2
  exit 1
}

################################################################################
###### Convert environment variables to command line arguments ########

pat="--([^ ]+).+"
arg_list=""
while IFS= read -r line; do
    # Check if line contains a command line argument
    if [[ $line =~ $pat ]]; then
      E=${BASH_REMATCH[1]}
      # Check that a matching environmental variable is declared
      if [[ ! ${!E} == "" ]]; then
        # Make sure argument isn't already included in the argument list
        if [[ ! ${arg_list} =~ "--${E}=" ]]; then
          # Add to argument list
          arg_list="${arg_list} --${E}=${!E}"
        fi
      fi
    fi
done < <(python3 script.py --help)

################################################################################
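# save_name is expected to be supplied as an environment variable on the Batch job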
python3 -u script.py ${arg_list} | tee "${save_name}.txt"

aws s3 cp "./${save_name}.p" "s3://bucket/${save_name}.p" || error_exit "Failed to upload results to s3 bucket."
aws s3 cp "./${save_name}.txt" "s3://bucket/logs/${save_name}.txt" || error_exit "Failed to upload logs to s3 bucket."

It also contains a requirements.txt file with three packages ( awscli , boto3 , botocore ), and a dummy Python script ( script.py ) that simply lists the files in an S3 bucket and saves the list to a file, which is then uploaded to S3.
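The post doesn't include script.py itself; a minimal sketch of what it describes might look like this (the --bucket and --save_name arguments and their defaults are assumptions for illustration, not the poster's actual code):

# Minimal sketch of the dummy script described above.
# The --bucket and --save_name arguments are assumptions for illustration.
import argparse
import pickle

import boto3

def main():
    parser = argparse.ArgumentParser()
    # run_job.sh parses this --help output to turn matching environment
    # variables into command line arguments
    parser.add_argument("--bucket", default="bucket", help="S3 bucket to list")
    parser.add_argument("--save_name", default="listing", help="output file stem")
    args = parser.parse_args()

    # List the keys in the bucket and pickle the list
    s3 = boto3.client("s3")
    contents = s3.list_objects_v2(Bucket=args.bucket).get("Contents", [])
    keys = [obj["Key"] for obj in contents]
    print(f"Found {len(keys)} objects in s3://{args.bucket}")

    with open(f"{args.save_name}.p", "wb") as f:
        pickle.dump(keys, f)

if __name__ == "__main__":
    main()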

Both in my local Windows environment and on the EC2 instance I have set up my AWS credentials with aws configure , and in both cases I can successfully build the image, tag it, and push it to ECR. The problem arises when I submit the job on AWS Batch, which should run the ECR container using the command ["./run_job.sh"] :

  • if AWS Batch uses the ECR image pushed from Windows, everything works fine
  • if it uses the image pushed from the EC2 Linux instance, the job fails, and the only info I can get is this:

Status reason: Task failed to start
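For context, submitting the job with boto3 looks roughly like this (a minimal sketch; the job name, queue, job definition, and the save_name value are placeholders, not values from the post):

# Minimal sketch of the Batch job submission; names are placeholders.
import boto3

batch = boto3.client("batch")
response = batch.submit_job(
    jobName="aws-batch-tutorial",
    jobQueue="my-job-queue",
    jobDefinition="my-job-definition",
    containerOverrides={
        "command": ["./run_job.sh"],
        # run_job.sh turns matching environment variables into
        # --save_name=... style arguments for script.py
        "environment": [{"name": "save_name", "value": "run1"}],
    },
)
print(response["jobId"])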

I was wondering if anyone has any idea of what might be causing the error.

I think I fixed the problem. The run_job.sh script in the Docker image has to have execute permission to be run by AWS Batch (but I think this is true in general). For some reason, when the image is built from Windows the script has this permission, but it doesn't when the image is built from Linux (the AWS EC2 Ubuntu instance). I fixed the problem by adding the following line to the Dockerfile:

 RUN chmod u+x run_job.sh
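You can see the difference by inspecting the file inside each built image, e.g. with docker run --rm <image> ls -l run_job.sh : the Windows-built image shows the execute bits set, while the Ubuntu-built one does not. Presumably this is because Windows filesystems don't store a Unix execute bit, so Docker marks files copied from a Windows build context as executable, whereas a Linux build preserves the file's actual mode.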
