
Jenkins dynamic slaves on Kubernetes - Very high build time

Background: I am migrating a project from CircleCI to Jenkins.

Project technology: TypeScript (Node.js).

I have deployed Jenkins on a newly created GKE cluster using the official Jenkins Helm chart, and I am using dynamic slaves.

The issue I am facing is with one of my applications. It is a group of 4 microservices that are built and deployed together as one project.

Since all the apps build and ship together, I have set up a Jenkins parallel build pipeline that pulls the repo and builds all the applications in parallel to save build time (the same logic as in the existing CircleCI setup).

In CircleCI it normally takes five to seven minutes to build the app, whereas in Jenkins it takes more than 20 minutes.

I suspected a resource limitation on the node, so I moved to a very high-spec node and monitored the build with the kubectl top pods command. CPU usage never goes above 3 CPUs during the entire build process.
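
One thing kubectl top pods does not show is whether the agent pod's containers have CPU or memory limits set, so it may be worth comparing usage against the configured requests and limits. The following is a minimal sketch; the pod name and namespace are placeholders (assumptions), and nothing in this post confirms that limits are the cause.

```shell
# Sketch only: pod and namespace arguments are hypothetical placeholders.

# Print each container's name and its resource requests/limits.
show_pod_resources() (
  pod="$1"
  namespace="${2:-default}"
  kubectl get pod "$pod" -n "$namespace" \
    -o 'jsonpath={range .spec.containers[*]}{.name}{"\t"}{.resources}{"\n"}{end}'
)

# Print current CPU/memory usage per container, to compare with the limits.
show_pod_usage() (
  pod="$1"
  namespace="${2:-default}"
  kubectl top pod "$pod" -n "$namespace" --containers
)
```

For example, `show_pod_resources my-agent-pod jenkins` followed by `show_pod_usage my-agent-pod jenkins` while a build is running. If usage sits right at a configured CPU limit, throttling is a possibility worth ruling out.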

For further debugging, I thought it could be an IOPS issue, as the project pulls a lot of node modules, so I changed the node disk to SSD for testing, but no luck.
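
Since npm install writes many small files, a rough way to compare disks across environments is to time the creation of a few thousand small files in the build directory. This is only an illustrative sketch, not a proper IOPS benchmark; the function name is made up, and it assumes GNU date (for `%N` nanoseconds).

```shell
# Sketch: create COUNT small files under DIR, sync, and print elapsed
# milliseconds. Assumes GNU date, which supports %N (nanoseconds).
small_file_bench() (
  dir="$1"
  count="${2:-2000}"
  work=$(mktemp -d "$dir/bench.XXXXXX") || exit 1
  start=$(date +%s%N)
  i=0
  while [ "$i" -lt "$count" ]; do
    printf 'module.exports = %d;\n' "$i" > "$work/file_$i.js"
    i=$((i + 1))
  done
  sync
  end=$(date +%s%N)
  rm -rf "$work"
  echo "$(( (end - start) / 1000000 ))"
)
```

Running `small_file_bench /path/to/workspace 5000` on the Jenkins agent and on the VM gives two numbers that can be compared directly.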

Next, I started provisioning a dynamic PV with every slave that Jenkins spawns, and again no luck.

I am not sure what I am missing. I checked the docker stats and the Kubernetes logs, but everything looks normal.

I am running docker build like this (for 4 different applications):

docker build --build-arg NODE_HEAP_SIZE=8096 --build-arg NPM_TOKEN=$NPM_TOKEN -f "test/Dockerfile" -t "test:123" .

This is what my Dockerfile looks like:

FROM node:10.19.0 AS node
WORKDIR /etc/xyz/test

COPY --from=gcr.io/berglas/berglas:0.5.0 /bin/berglas /usr/local/bin/berglas
COPY docker-entrypoint.sh /

ENTRYPOINT ["/docker-entrypoint.sh"]

#
# development stage used in conjunction with docker-compose for local development
#

FROM node AS dev
ENV NODE_ENV="development"


COPY new/package.json new/package-lock.json ../new/
RUN (cd ../new && npm install)

COPY brand/package.json brand/package-lock.json ../brand/
RUN (cd ../brand && npm install)

COPY chain/package.json chain/package-lock.json ./
RUN npm install

COPY chain ./
COPY new ../new/
COPY brand ../brand/

#
# production stage that compiles and runs production artifacts
#

FROM dev AS prod
ENV NODE_ENV="production"

ARG NODE_HEAP_SIZE="4096"
RUN NODE_OPTIONS="--max-old-space-size=${NODE_HEAP_SIZE}" npm run build:prod

To verify the network bandwidth on the nodes, I started an Ubuntu container and ran a network test, and the result was up to the mark.

I even tried passing --cache-from to improve caching during the build, but no luck here either.
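
For reference, with the classic Docker builder --cache-from only helps if the referenced image is already present on the machine running the build, which a freshly spawned agent may not have. A minimal sketch of the pull-then-build pattern follows; the function name and the image reference are hypothetical, and it assumes a previously built image has been pushed to a registry the agent can reach.

```shell
# Sketch: pull the last pushed image (if any) and reuse its layers as cache.
# IMAGE is a hypothetical registry reference, not a value from this post.
build_with_cache() (
  image="$1"
  dockerfile="$2"
  context="${3:-.}"
  # A missing image should not fail the build; it just means a cold cache.
  docker pull "$image" || true
  docker build --cache-from "$image" -f "$dockerfile" -t "$image" "$context"
)
```

Usage would look like `build_with_cache registry.example.com/test:latest test/Dockerfile .`. With BuildKit, the cached image additionally needs inline cache metadata (built with `--build-arg BUILDKIT_INLINE_CACHE=1`).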

I have even tried changing NODE_HEAP_SIZE to a very high value, but it did not bring any improvement.

From what I have seen, most of the time is spent in npm install, npm ci, or npm run build.

Adding further investigation:

I have tried running the same build steps on a VM, and also by spinning up a docker container on the same VM and running the docker build inside it. Both take significantly less time than running in Jenkins dynamic slaves. The build time is roughly double on the dynamic slaves.

Most of the time is spent in the npm install and npm ci steps.
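
To compare the same step across the VM, CircleCI, and the Jenkins agents, it can help to wrap each command in a small timing helper and keep the numbers side by side. This is a sketch with a made-up helper name; it reports whole seconds on stderr and preserves the wrapped command's exit status.

```shell
# Sketch: run a labelled command, report its duration on stderr,
# and exit with the command's own status.
time_step() (
  label="$1"
  shift
  start=$(date +%s)
  "$@" && status=0 || status=$?
  end=$(date +%s)
  echo "[timing] $label: $((end - start))s (exit $status)" >&2
  exit "$status"
)
```

For example, `time_step "npm ci" npm ci` inside the build script prints one timing line per step, which makes it easier to see which step accounts for the difference.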

I don't understand how CircleCI is able to build it faster.

Can someone help me figure out what else I should debug?

Without checking the logs, it is hard to say what is happening in Jenkins. Please take a look at this article about global Jenkins logs and configuring additional log recorders.

I had a similar problem with dynamic Jenkins slaves in AWS: the developers of the "Amazon EC2" plugin changed its security settings, and checking the ssh keys took about 15 minutes.
