Cache NPM dependencies on Jenkins pipeline

We all know that downloading dependencies with npm can be very time-consuming, especially when we are limited to old npm versions.

For me, as a developer, this wasn't such a big deal because I only had to do it a few times on my local development machine, and everything worked with the node_modules cache in my project's folder. But now I want to take these applications to a CI environment, with Jenkins.

I realized a huge amount of time was spent on downloading dependencies with npm. This is a problem because:

  1. npm downloads the dependencies into the project's folder, not a global folder such as Maven's /home/user/.m2

  2. I have to clean up the Jenkins workspace folder in every run to avoid issues with the git checkout.

I want a very elegant solution for caching the npm dependencies on my Jenkins slaves, but so far I can only think of:

  1. Removing everything but the node_modules folders from the Jenkins workspace. I don't like this because I could consume lots of HDD space if I keep creating branches for my project. Each branch creates a workspace.

  2. Doing something like cp -r ./node_modules /home/npm_cache after every npm install and then cp -r /home/npm_cache ./node_modules after the code checkout.

I feel these solutions are terrible. There must be a better way to do this.

What I have done in my Jenkins pipeline for 3 different projects is to use tar instead of cp, and then npm install instead of npm ci. For each project:

  1. cd to your project
  2. npm i
  3. tar cvfz ${HOME}/your_project_node_modules.tar.gz node_modules

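Then in the pipeline:

dir('your_project') {
  // Single quotes so the shell, not Groovy, expands ${HOME}
  sh 'tar xf ${HOME}/your_project_node_modules.tar.gz'
  // Only installs what changed since the archive was created
  sh 'npm i'
}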

Of course, it has the disadvantage that dependencies change over time and the install will take longer, but I've managed to reduce my disk space usage in the image by about 0.5 GB, and tar is much faster than cp (cp ~30 sec, tar ~5 sec).

In my case, total install time went from about 3 minutes to a matter of seconds.
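
The save and restore halves of this approach can be sketched as two shell functions. This is a minimal sketch rather than the exact commands above: the function names and the NODE_MODULES_ARCHIVE variable are hypothetical, and it assumes a POSIX shell with tar and gzip available on the agent.

```shell
# Hypothetical archive location; override NODE_MODULES_ARCHIVE per project.
NODE_MODULES_ARCHIVE="${NODE_MODULES_ARCHIVE:-$HOME/your_project_node_modules.tar.gz}"

# Pack the current node_modules folder into the archive.
save_node_modules() {
    # Write to a temporary name first so a half-written archive is never picked up.
    tar czf "$NODE_MODULES_ARCHIVE.tmp" node_modules &&
        mv "$NODE_MODULES_ARCHIVE.tmp" "$NODE_MODULES_ARCHIVE"
}

# Unpack the archive into the current directory, if one exists.
restore_node_modules() {
    if [ -f "$NODE_MODULES_ARCHIVE" ]; then
        tar xzf "$NODE_MODULES_ARCHIVE"
    else
        echo "no archive at $NODE_MODULES_ARCHIVE, starting without node_modules"
    fi
}
```

In a pipeline, restore_node_modules would run before npm i, and save_node_modules could run afterwards whenever the archive should be refreshed.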

NPM has a global cache stored in ~/.npm
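
npm's global cache (by default ~/.npm) can be reused across builds by pointing npm at a fixed cache directory on the agent and asking it to prefer cached packages. A minimal sketch, assuming a package-lock.json is committed (npm ci requires one); the function name and the NPM_CACHE_DIR variable are hypothetical, while --cache, --prefer-offline and --no-audit are standard npm options:

```shell
# Install dependencies using a cache directory that survives workspace cleanups.
install_from_shared_cache() {
    # NPM_CACHE_DIR is an assumed variable; it falls back to npm's default location.
    cache_dir="${NPM_CACHE_DIR:-$HOME/.npm}"
    mkdir -p "$cache_dir"
    # --prefer-offline uses cached packages without revalidating them and
    # only goes to the registry for packages missing from the cache.
    npm ci --cache "$cache_dir" --prefer-offline --no-audit
}
```

Because the cache lives outside the workspace, wiping the workspace before each checkout does not discard the downloaded packages.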

Those parts of the Jenkinsfile will do the following:

On the master and develop branches, a fresh npm install is always executed.

On all other branches, package.json is MD5-hashed, and after npm install the node_modules folder is placed in the defined cache folder, like: <CACHE_DIRECTORY>/<MD5_SUM_PACKAGE_JSON>/node_modules.

The next build can reuse that node_modules folder and doesn't have to download all the modules again.

parameters {
    booleanParam(name: "CACHED_NODE_MODULES",
            description: "Should node_modules be taken from cache?",
            defaultValue: !'master'.equals(env.BRANCH_NAME) && !'develop'.equals(env.BRANCH_NAME))
}

... ...

stage('Build') {
   steps {
      cacheOrRestoreNodeModules()
      echo "Performing npm build..."
      sh 'npm install'

   }
}

... ...

def cacheOrRestoreNodeModules() {
    if (params.CACHED_NODE_MODULES) {
        sh '''
        # Key the cache on the MD5 of package.json (POSIX sh, no bash arrays)
        MD5_SUM_PACKAGE_JSON=$(md5sum package.json | cut -d ' ' -f 1)
        CACHE_FOLDER=/home/jenkins/.cache/npm/${MD5_SUM_PACKAGE_JSON}

        # check if folder exists and copy node_modules to current directory
        if [ -d "${CACHE_FOLDER}" ]; then
          cp -r "${CACHE_FOLDER}/node_modules" .
        fi

        npm install --no-audit

        # if folder does not exist, create it and cache node_modules folder
        if ! [ -d "${CACHE_FOLDER}" ]; then
          mkdir -p "${CACHE_FOLDER}"
          cp -r node_modules "${CACHE_FOLDER}/node_modules"
        fi
        '''
    }
}
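
The lookup in cacheOrRestoreNodeModules keys the cache on the MD5 of package.json. The same idea can be sketched as standalone shell functions that are easier to try outside Jenkins. The names cache_dir_for, restore_from_cache and store_in_cache, and the CACHE_ROOT variable, are hypothetical, and the sketch assumes md5sum from GNU coreutils is available.

```shell
# Assumed cache root; each package.json hash gets its own subdirectory.
CACHE_ROOT="${CACHE_ROOT:-$HOME/.cache/npm}"

# Print the cache directory that belongs to the given manifest file.
cache_dir_for() {
    hash=$(md5sum "$1" | cut -d ' ' -f 1)
    printf '%s/%s\n' "$CACHE_ROOT" "$hash"
}

# Copy a cached node_modules into the current directory; returns 1 on a cache miss.
restore_from_cache() {
    dir=$(cache_dir_for package.json)
    [ -d "$dir/node_modules" ] || return 1
    cp -r "$dir/node_modules" .
}

# Store node_modules under the manifest hash unless that entry already exists.
store_in_cache() {
    dir=$(cache_dir_for package.json)
    [ -d "$dir/node_modules" ] && return 0
    mkdir -p "$dir"
    cp -r node_modules "$dir/node_modules"
}
```

Any change to package.json yields a new hash and therefore a new cache entry, so old entries accumulate until something removes them.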

I don't know node.js well enough to know how to handle this on that side. But one simple way to handle it on a Linux machine is to symlink the cache directory to an external location right after you check out from git. Each agent machine will maintain its own cache, but you would probably have to do that regardless of the solution.

I assume you have investigated the NodeJS plugin and it can't do what you want.

I created this script to check the md5sum of package.json in Jenkins:
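
The symlink idea can be sketched as follows. This is only an illustration under assumptions: the function name and the SHARED_MODULES_ROOT variable are hypothetical, and it is meant to be followed by npm install, because npm ci removes any existing node_modules before it installs.

```shell
# Link ./node_modules to a per-project directory outside the workspace.
link_node_modules() {
    project="$1"
    # SHARED_MODULES_ROOT is an assumed location on the agent.
    target="${SHARED_MODULES_ROOT:-$HOME/.cache/node_modules}/$project"
    mkdir -p "$target"
    # Drop whatever the checkout left behind (removes a stale link, not its target),
    # then create the link.
    rm -rf node_modules
    ln -s "$target" node_modules
}
```

Running link_node_modules my-app right after the checkout makes packages written to ./node_modules land in the external directory, so they survive a workspace wipe.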

stage('NPM Build') {
  steps {
    sh '''
    node -v && npm -v
    '''
    // rm -rf node_modules
    sh '''
    CACHE_FOLDER=${HOME}/.cache/md5
    echo "EXECUTOR_NUMBER: ${EXECUTOR_NUMBER}"
    MD5_FILE_NAME=package-json_${EXECUTOR_NUMBER}.md5sum

    [ -d ${CACHE_FOLDER} ] || mkdir -p ${CACHE_FOLDER}
    ls ${CACHE_FOLDER}

    if [ -f ${CACHE_FOLDER}/${MD5_FILE_NAME} ];then
      cp ${CACHE_FOLDER}/${MD5_FILE_NAME} ${MD5_FILE_NAME}
      md5sum package.json
      cat ${MD5_FILE_NAME}
      md5sum -c ${MD5_FILE_NAME} || npm ci
    else
      echo "No md5sum backup"
      npm ci
    fi

    echo "create new md5sum backup"
    md5sum package.json
    md5sum package.json > ${MD5_FILE_NAME}
    cp ${MD5_FILE_NAME} ${CACHE_FOLDER}
    '''
    sh '''
    npm run ngcc
    '''
    sh '''
    npm run build
    '''
  }
}
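
The stage above relies on md5sum -c, which exits non-zero when package.json no longer matches the stored checksum. A minimal sketch of that decision as reusable shell functions follows. The function names and the CHECKSUM_FILE variable are hypothetical, and unlike the stage above the sketch also treats a missing node_modules folder as a reason to reinstall.

```shell
# Assumed location of the stored checksum; it must live outside the workspace.
CHECKSUM_FILE="${CHECKSUM_FILE:-$HOME/.cache/md5/package-json.md5sum}"

# Succeeds only if node_modules exists and package.json matches the stored checksum.
deps_up_to_date() {
    [ -d node_modules ] || return 1
    [ -f "$CHECKSUM_FILE" ] || return 1
    md5sum -c "$CHECKSUM_FILE" >/dev/null 2>&1
}

# Record the checksum of the current package.json for the next build.
record_checksum() {
    mkdir -p "$(dirname "$CHECKSUM_FILE")"
    md5sum package.json > "$CHECKSUM_FILE"
}
```

A build step could then run deps_up_to_date || npm ci, followed by record_checksum.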

I have chosen to run every build in a fresh docker container, but dependency caching can still be done. This is what I have done:

  • Each project has a cache for npm packages, which are archived in a file containing the node_modules folder. These archives are all stored in the /home/.cache/node_modules folder on the host (the node where the build is run). So, when starting the docker container, it must have a bind mount like
docker { 
    image dockerImage
    args "... -v \"/home/.cache/node_modules:/home/.cache/node_modules\""
}
  • I am using a shared library with a custom step for building; its implementation is more or less this one:
sh """#!/bin/bash -xe
    function getNodeModulesListHash {
        npm ls 2> /dev/null | md5sum | cut -d ' ' -f 1
    }
    
    frontendProjectHashZip="\$(echo "${project}" | md5sum | cut -d ' '  -f 1).tar"
    [[ -f "/home/.cache/node_modules/\$frontendProjectHashZip" ]] && tar -xf "/home/.cache/node_modules/\$frontendProjectHashZip"

    hashBeforeInstall="\$(getNodeModulesListHash)"
    npm install
    hashAfterInstall="\$(getNodeModulesListHash)"

    if [[ \$hashBeforeInstall != \$hashAfterInstall ]]
    then 
        tar -cf \$frontendProjectHashZip node_modules
        rm -f "/home/.cache/node_modules/\$frontendProjectHashZip"
        mv \$frontendProjectHashZip "/home/.cache/node_modules/\$frontendProjectHashZip"
    fi
"""

getNodeModulesListHash is used to get the hash of the currently installed packages. This hash is computed before and after npm install, so that if the two values are the same I do not need to recreate the archive with node_modules and can keep the one I initially extracted. The rest is pretty straightforward, and the logic is very similar to what other users proposed.
