Dockerfile RUN layers vs. scripts

Question

Docker version 19.03.12, build 48a66213fe

So in a Dockerfile, if I have the following lines:

RUN yum install aaa \
        bbb \
        ccc && \
        <some cmd> && \
        <etc> && \
         <some cleanup>

Is that a best practice? Should I keep the yum part separate from the part where I call other <commands/scripts>?

If I want a cleaner (vs. traceable) Dockerfile, what if I put those lines in a .sh script and just call that script (i.e. COPY followed by a RUN statement)? Will the build step run each time, even though nothing has changed inside the .sh script? Looking for some gotchas here.

I'm thinking that whichever packages are stable should get a separate RUN <those packages>, i.e. their own layer, and that lines which depend on something that changes frequently, e.g. user-defined values (docker build-time CLI-level args), should be kept in a separate RUN layer, so I can use the layer cache effectively.

Wondering whether you think keeping a cleaner Dockerfile (calling RUN some.sh) would be less efficient than a traceable Dockerfile (where everything that makes up the image is listed in the Dockerfile).

Thanks.

Answer 1

I guess the question is somewhat opinion-based.

It depends on what you are after. It's ultimately a tradeoff between development experience and an optimized image.

If you put everything in one RUN instruction, you are reducing the number of layers and therefore the image size to some degree. Also, each layer is stored in the registry, so with more layers pushing and pulling get more time-consuming and expensive. On the other hand, it means that each small change causes everything in the RUN instruction to run again, as it invalidates the cache for that single layer.
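As a rough sketch of the caching side of that tradeoff, the fragment below keeps stable packages in their own layer and puts a step that depends on a build argument after it. The base image, the package names, and the `APP_VERSION` argument are assumptions made for illustration; none of them come from the question.

```dockerfile
# Sketch only: base image, packages, and APP_VERSION are placeholders.
FROM rockylinux:9

# Stable packages: this layer is served from cache until this line changes.
RUN yum install -y tar gzip && \
    yum clean all

# Build-time argument, e.g. docker build --build-arg APP_VERSION=1.2.3 .
ARG APP_VERSION=0.0.0

# RUN instructions after the ARG are rebuilt when APP_VERSION changes;
# the package layer above is still reused.
RUN echo "${APP_VERSION}" > /etc/app-version
```

The price is one extra layer, which is the other side of the tradeoff described above.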

If you are creating temporary files with a RUN instruction that are removed by a later RUN instruction, then it would be better to run both commands in a single instruction to not create a layer with temporary files.
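A minimal sketch of that pattern, assuming a hypothetical archive URL and install path (both made up for illustration):

```dockerfile
# Sketch only: the URL and paths are hypothetical placeholders.
FROM alpine

# Download, unpack, and delete the archive in one instruction, so the
# archive is never committed as part of a layer.
RUN wget -O /tmp/tool.tar.gz https://example.com/tool.tar.gz && \
    mkdir -p /opt/tool && \
    tar -xzf /tmp/tool.tar.gz -C /opt/tool && \
    rm /tmp/tool.tar.gz

# By contrast, a separate "RUN rm /tmp/tool.tar.gz" after a separate
# download step would leave the archive stored in the earlier layer.
```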

For a production image, I would opt for a single RUN instruction, as optimization is more important than build speed and caching, IMO. If you can, you could also use a multi-stage build, where the first stage uses individual RUN instructions to take advantage of layer caching. In the second stage, some artifacts from the first stage are taken and the number of layers is aggressively kept to a minimum. Only the final stage is pushed to and pulled from a registry.

For example, in the Dockerfile below, the builder stage uses more instructions than strictly required in order to get better caching. The template file is even copied into the first stage, although it is not used there at all, since it is only read at runtime. But this way the final stage can get the output binary and the template with a single COPY instruction.

FROM golang as builder
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY *.go /src/
RUN mkdir -p /dist/templates
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -a -o /dist/run .
COPY haproxy.cfg.template /dist/templates/

FROM alpine
WORKDIR /mananger
COPY --from=builder /dist ./
ENTRYPOINT ["./run"]

In terms of script vs. RUN instruction, I think it is more idiomatic to use a RUN instruction and chain multiple commands with the double ampersand &&. If things get very complex, then it may be better to use a dedicated script to make better use of shell syntax and features. It depends on what you are doing there.

Will the build step run each time, even though nothing has changed inside the .sh script?

The build step would only run once and then be cached. As long as the content of the script does not change, Docker uses the cached layer. You need to get the file into the image before you can run it, so I guess the real cache invalidation would already happen at the COPY instruction if the file has changed.

As mentioned in the previous paragraph, using a script costs you at minimum one additional COPY or ADD instruction, which introduces an extra layer that could have been avoided if a plain RUN instruction had been used.
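A minimal sketch of the script variant, assuming a hypothetical `install.sh` that sits next to the Dockerfile in the build context:

```dockerfile
# Sketch only: install.sh is a hypothetical script in the build context.
FROM alpine

# The cache for this layer is invalidated when the content of install.sh changes.
COPY install.sh /tmp/install.sh

# Served from cache as long as this line and every earlier layer are unchanged.
RUN sh /tmp/install.sh
```

Editing `install.sh` invalidates the COPY layer, and with it the RUN layer that follows, so the script runs again on the next build.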

Answer 2

In terms of the final image filesystem, you will notice no difference whether you RUN the commands directly, RUN a script, or have multiple RUN commands. The number of layers and the size of the command string don't really make any difference at all.

What can you observe?

Particularly on the "classic" Docker build system, each RUN command becomes an image layer. In your example, you RUN yum install && ... && <some cleanup>; if this were split into multiple RUN commands, then the un-cleaned-up content would be committed as part of the image and would take up space even though it's removed in a later layer.
"More layers" isn't necessarily bad on its own, unless you have so many layers that you hit an internal limit. The only real downside here is creating a layer with content that you're planning to delete, in which case its space will still be in the final image.
As a more specific example of this, there's an occasional pattern where an image installs some development-only packages, runs an installation step, and uninstalls the packages. An Alpine-based example might look like
```
RUN apk add --virtual .build-deps \
      gcc make \
 && make \
 && make install \
 && apk del .build-deps
```
In this case you must run the "install" and "uninstall" in the same RUN command; otherwise Docker will create a layer that includes the build-only packages.
(A multi-stage build may be an easier way to accomplish the same goal of needing build-only tools, but not including them in the final image.)
The actual text of the RUN command is visible in docker history and similar inspection commands.
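The multi-stage alternative mentioned in the parenthetical above could look roughly like this. It is a sketch under assumptions: a Makefile-based project in the build context whose `make install` honors `DESTDIR`; the `/out` staging directory and the `musl-dev` package are likewise illustrative choices.

```dockerfile
# Sketch only: assumes a Makefile-based project whose "make install"
# honors DESTDIR; /out is an arbitrary staging directory.
FROM alpine AS build
RUN apk add --no-cache gcc make musl-dev
WORKDIR /src
COPY . .
RUN make && make install DESTDIR=/out

# The final image starts from a clean base and never contains gcc or make.
FROM alpine
COPY --from=build /out/ /
```

Here the build stage can use as many RUN instructions as is convenient, since none of its layers end up in the final image.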

And...that's kind of it. If you think it's more maintainable to keep the installation steps in a separate script (maybe you have some way to use the same script in a non-Docker context), then go for it. I'd generally default to keeping the steps spelled out in RUN commands, and in general try to keep those setup steps as lightweight as possible.

Answer 1
1 2022-01-27 18:39:41

Answer 2
1 2022-01-27 18:48:46
