Issue with running Node application in AWS ECS container

I have created a Docker image and pushed it to an AWS ECR repository.

I'm creating a task with 3 containers: one for Redis, one for PostgreSQL, and one for that image, which is my Node project.

In the Dockerfile, I have added a CMD that runs the app with the node command. Here is the Dockerfile content:

FROM node:16-alpine as build

WORKDIR /usr/token-manager/app

COPY package*.json .

RUN npm install

COPY . .

RUN npm run build

FROM node:16-alpine as production

ARG ENV_ARG=production
ENV NODE_ENV=${ENV_ARG}

WORKDIR /usr/token-manager/app

COPY package*.json .

RUN npm install --production

COPY --from=build /usr/token-manager/app/dist ./dist

CMD ["node", "./dist/index.js"]

This image works locally in docker-compose without any issue.

The issue is that when I run the task in an ECS cluster, the Node project does not run. It seems that the CMD command is not being executed.

I tried to override that CMD by adding a new command to the task definition.

When I run the task with this command, there is nothing in the CloudWatch log and the Node app is evidently not running. There is no log at all for api-container.

When I change the command to something else, for example "ls", it gets executed and I can see the result in the CloudWatch log.

Or, when I change it to an invalid command, I get an error in the log.

But when I change it to the correct command, the one that should run the app, nothing happens. Nothing shows up in the log, not even an error.

I have added inbound rules to allow the port needed for connecting to the app, but it seems the app is not running at all!

What should I do? How can I find out what the issue is?

UPDATE: I changed the app container's configuration to mark it as Essential, which means the whole task stops if this container fails or stops for any reason. I then started the task again and it got stopped, so now I'm sure the app container is crashing and exiting somehow, but there is still nothing in the log!

First: Make sure your Docker image is deployed to ECR (you can use CodePipeline), because that is where ECS will look for the image.

Second: Please specify your launch type. In the case of EC2, make sure you are using the latest Node image when adding the container. You can find the latest Docker image for Node here: https://hub.docker.com/_/node

Third: Create the task definition and run the task, then navigate to the cluster, check whether the task is running, and check the task status.

Fourth: Make sure you allow all inbound traffic in the security group and open HTTP for 0.0.0.0/0.

You can test using curl, e.g.: http://ec2-52-38-113-251.us-west-2.compute.amazonaws.com

If that does not work, I would recommend deploying a simple Node app first, getting that running, and then deploying your project. Thank you.

I found the issue. I'll post it here, as it may help someone else.

If you go to the Cluster details screen > Tasks tab > Stopped > Task ID, you can see a brief status message for each container in the Containers list.

It says that the container was killed due to a memory issue. We can fix this by increasing the memory we specify for the containers when creating a new task definition.
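The same status message can also be read outside the console. The sketch below is only an illustration: it assumes you have the JSON output of an ECS DescribeTasks call (for example from `aws ecs describe-tasks`) and pulls out the exit code and reason for each container. The function name and the sample data are made up for this example; only the field names (`tasks`, `stoppedReason`, `containers`, `name`, `exitCode`, `reason`) follow the DescribeTasks response shape.

```javascript
// Summarize why each container in a stopped ECS task exited.
// `response` is assumed to have the shape of the ECS DescribeTasks output.
function summarizeStoppedContainers(response) {
  const rows = [];
  for (const task of response.tasks ?? []) {
    for (const container of task.containers ?? []) {
      rows.push({
        container: container.name,
        exitCode: container.exitCode ?? null,
        // Prefer the container-level reason, fall back to the task-level one.
        reason: container.reason ?? task.stoppedReason ?? "(no reason reported)",
      });
    }
  }
  return rows;
}

// Hypothetical sample data, trimmed to the fields used above.
const sampleResponse = {
  tasks: [
    {
      stoppedReason: "Essential container in task exited",
      containers: [
        {
          name: "api-container",
          exitCode: 137,
          reason: "OutOfMemoryError: Container killed due to memory usage",
        },
        { name: "redis", exitCode: 0 },
      ],
    },
  ],
};

for (const row of summarizeStoppedContainers(sampleResponse)) {
  console.log(`${row.container}: exit ${row.exitCode}, ${row.reason}`);
}
```

An exit code of 137 means the process received SIGKILL (128 + 9), which is what you typically see when a container is killed for exceeding its memory limit.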

At the task level, you specify the total amount of memory for the whole task, which is shared between all of its containers.

When you add a new container, there is a field for specifying its memory limit:

Hard Limit: If you specify a hard limit, your container is killed when it attempts to exceed that amount of memory.

Soft Limit: If you specify a soft limit, ECS reserves that amount of memory for your container, but the container can request more memory, up to the hard limit (if one is set).
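In a task definition, the hard limit corresponds to the container's `memory` field and the soft limit to `memoryReservation`, both in MiB. As a rough sketch, the hypothetical helper below checks a set of container definitions before you register them. The first check reflects the documented rule that `memory` must be greater than `memoryReservation` when both are set. The second check, comparing the total reserved memory against the task-level memory, is just a sanity check chosen for this example. The container names and numbers are invented.

```javascript
// Sanity-check container memory settings against the task-level memory.
// All values are in MiB. `memory` is the hard limit and
// `memoryReservation` is the soft limit of a container definition.
function checkMemorySettings(taskMemoryMiB, containerDefinitions) {
  const problems = [];
  let reservedTotal = 0;
  for (const def of containerDefinitions) {
    const hard = def.memory;
    const soft = def.memoryReservation;
    if (hard !== undefined && soft !== undefined && hard <= soft) {
      problems.push(
        `${def.name}: hard limit (${hard}) should be greater than soft limit (${soft})`
      );
    }
    // Count the soft limit if present, otherwise the hard limit.
    reservedTotal += soft ?? hard ?? 0;
  }
  if (reservedTotal > taskMemoryMiB) {
    problems.push(
      `containers reserve ${reservedTotal} MiB but the task has only ${taskMemoryMiB} MiB`
    );
  }
  return problems;
}

// Hypothetical values for the three containers in the task.
const containerDefinitions = [
  { name: "api-container", memory: 512, memoryReservation: 256 },
  { name: "redis", memoryReservation: 128 },
  { name: "postgres", memoryReservation: 256 },
];

console.log(checkMemorySettings(1024, containerDefinitions));
console.log(checkMemorySettings(512, containerDefinitions));
```

With these example numbers the containers reserve 640 MiB in total, so the first call reports no problems and the second reports that the task memory is too small.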

So the main point is this: when a container fails during startup, there may be nothing in CloudWatch at all. When there is a problem but the logs show nothing, check for causes such as memory limits, or anything else that prevents the container from starting.
