Starting and populating a Postgres container in Docker

I have a Docker container that contains my Postgres database. It uses the official Postgres image, which has a CMD entry that starts the server on the main thread.

I want to populate the database by running RUN psql -U postgres postgres < /dump/dump.sql before it starts listening for queries.

I don't understand how this is possible with Docker. If I place the RUN command after CMD, it will of course never be reached because Docker has finished reading the Dockerfile. But if I place it before the CMD, it will run before psql even exists as a process.

How can I prepopulate a Postgres database in Docker?

After a lot of fighting, I have found a solution ;-)

A comment posted by "justfalter" at https://registry.hub.docker.com/_/postgres/ was very useful to me.

Anyway, this is how I did it:

# Dockerfile
FROM postgres:9.4

RUN mkdir -p /tmp/psql_data/

COPY db/structure.sql /tmp/psql_data/
COPY scripts/init_docker_postgres.sh /docker-entrypoint-initdb.d/

db/structure.sql is an SQL dump, used to initialize the first tablespace.

Then, the init_docker_postgres.sh:

#!/bin/bash

# this script is run by the image's entrypoint the first time the container
# starts with an empty data directory (not at docker build time)
# it imports the base database structure and creates the database for the tests

DATABASE_NAME="db_name"
DB_DUMP_LOCATION="/tmp/psql_data/structure.sql"

echo "*** CREATING DATABASE ***"

# create default database
gosu postgres postgres --single <<EOSQL
  CREATE DATABASE "$DATABASE_NAME";
  GRANT ALL PRIVILEGES ON DATABASE "$DATABASE_NAME" TO postgres;
EOSQL

# clean sql_dump - because I want to have a one-line command

# remove indentation
sed "s/^[ \t]*//" -i "$DB_DUMP_LOCATION"

# remove comments
sed '/^--/ d' -i "$DB_DUMP_LOCATION"

# remove new lines
sed ':a;N;$!ba;s/\n/ /g' -i "$DB_DUMP_LOCATION"

# remove other spaces
sed 's/  */ /g' -i "$DB_DUMP_LOCATION"

# remove leading spaces
sed 's/^ *//' -i "$DB_DUMP_LOCATION"

# append new line at the end (suggested by @Nicola Ferraro)
sed -e '$a\' -i "$DB_DUMP_LOCATION"

# import sql_dump
gosu postgres postgres --single "$DATABASE_NAME" < "$DB_DUMP_LOCATION";


echo "*** DATABASE CREATED! ***"

So finally:

# no postgres is running
[myserver]# psql -h 127.0.0.1 -U postgres
psql: could not connect to server: Connection refused
    Is the server running on host "127.0.0.1" and accepting
    TCP/IP connections on port 5432?

[myserver]# docker build -t custom_psql .
[myserver]# docker run -d --name custom_psql_running -p 5432:5432 custom_psql

[myserver]# docker ps -a
CONTAINER ID        IMAGE                COMMAND                CREATED             STATUS              PORTS                    NAMES
ce4212697372        custom_psql:latest   "/docker-entrypoint.   9 minutes ago       Up 9 minutes        0.0.0.0:5432->5432/tcp   custom_psql_running

[myserver]# psql -h 127.0.0.1 -U postgres
psql (9.2.10, server 9.4.1)
WARNING: psql version 9.2, server version 9.4.
         Some psql features might not work.
Type "help" for help.

postgres=# 

# postgres is now initialized with the dump

Hope it helps!
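The sed clean-up in init_docker_postgres.sh is there because a single-user postgres session, by default, treats each newline as the end of a command, so a multi-line dump has to be collapsed first. As a rough sketch, the same sed expressions can be tried as a filter on a throwaway sample before pointing them at a real dump. The function name flatten_sql and the sample SQL are made up for illustration, and GNU sed is assumed.

```shell
#!/bin/bash
# flatten_sql: hypothetical helper (not part of the answer above).
# Reads SQL on stdin and writes it as a single line on stdout, using the same
# sed expressions as the init script but without -i, so no file is modified.
# Assumes GNU sed.
flatten_sql() {
  sed 's/^[ \t]*//' |            # remove indentation
    sed '/^--/ d' |              # drop lines that are SQL comments
    sed ':a;N;$!ba;s/\n/ /g' |   # join all lines with single spaces
    sed 's/  */ /g' |            # squeeze runs of spaces
    sed 's/^ *//'                # trim leading spaces
}

# Sample input: one comment line and one indented multi-line statement.
printf -- '-- sample\n  CREATE TABLE t (\n    id int\n  );\n' | flatten_sql
# prints: CREATE TABLE t ( id int );
```

Only whole-line comments are removed; a dump containing `--` inside string literals or trailing comments would need more care than this sketch gives it.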

For those who want to initialize a PostgreSQL DB with millions of records during the first run:

Import using *.sql dump

You can make a simple SQL dump and copy the dump.sql file into /docker-entrypoint-initdb.d/. The problem is speed. My dump.sql script is about 17 MB (a small DB: 10 tables, with 100k rows in only one of them) and the initialization takes over a minute (!). That is unacceptable for local development, unit tests, etc.

Import using binary dump

The solution is to make a binary PostgreSQL dump and use the image's support for shell initialization scripts. Then the same DB is initialized in about 500 ms instead of 1 minute.

1. Create the dump.pgdata binary dump of a DB named "my-db" directly from within a container or your local DB

pg_dump -U postgres --format custom my-db > "dump.pgdata"

Or from the host, from a running container (postgres-container)

docker exec postgres-container pg_dump -U postgres --format custom my-db > "dump.pgdata"

2. Create a Docker image with a given dump and initialization script

$ tree
.
├── Dockerfile
└── docker-entrypoint-initdb.d
    ├── 01-restore.sh
    ├── 02-small-updates.sql
    └── dump.pgdata
$ cat Dockerfile
FROM postgres:11

COPY ./docker-entrypoint-initdb.d/ /docker-entrypoint-initdb.d/
$ cat docker-entrypoint-initdb.d/01-restore.sh
#!/bin/bash

file="/docker-entrypoint-initdb.d/dump.pgdata"
dbname=my-db

echo "Restoring DB using $file"
pg_restore -U postgres --dbname="$dbname" --verbose --single-transaction < "$file" || exit 1
$ cat docker-entrypoint-initdb.d/02-small-updates.sql
-- some updates on your DB, for example for next application version
-- this file will be executed on DB during next release
UPDATE ... ;

3. Build an image and run it

$ docker build -t db-test-img .
$ docker run -it --rm --name db-test db-test-img
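pg_restore rejects plain-text SQL dumps, so a .sql file copied in under the wrong name would make 01-restore.sh fail at container start. One possible guard, sketched here, is to look at the first bytes of the file: custom-format archives written by pg_dump begin with the string PGDMP. The helper name is_custom_dump is hypothetical, and the check covers only the custom format (a tar-format dump would not pass it).

```shell
#!/bin/bash
# is_custom_dump: hypothetical helper (not from the answer above).
# Succeeds if FILE starts with the 5-byte magic "PGDMP" that pg_dump writes
# at the start of a custom-format archive; fails for anything else,
# including a missing file.
is_custom_dump() {
  [ "$(head -c 5 -- "$1" 2>/dev/null)" = "PGDMP" ]
}

# Default path taken from the layout shown above; override with the first argument.
file="${1:-/docker-entrypoint-initdb.d/dump.pgdata}"
if is_custom_dump "$file"; then
  echo "$file looks like a custom-format dump"
else
  echo "$file is missing or is not a custom-format dump"
fi
```

In a restore script the else branch would more likely exit non-zero so that the container stops instead of starting with an empty database.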

Alternatively, you can just mount a volume to /docker-entrypoint-initdb.d/ that contains all your DDL scripts. You can put in *.sh, *.sql, or *.sql.gz files and the image will take care of executing them on start-up (only when the data directory is still empty, i.e. on first initialization).

E.g. (assuming you have your scripts in /tmp/my_scripts):

docker run -v /tmp/my_scripts:/docker-entrypoint-initdb.d postgres
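To preview what a mounted directory would trigger, the dry-run sketch below walks the files in name order and prints the kind of action each extension maps to. It is a simplified illustration, not the image's actual entrypoint code; the function name preview_initdb is made up, and nothing is executed.

```shell
#!/bin/bash
# preview_initdb: hypothetical dry-run helper. Lists, in glob (name) order,
# how each file in DIR would be treated based on its extension.
# It only prints; it never runs psql or any script.
preview_initdb() {
  local dir="$1" f
  for f in "$dir"/*; do
    [ -e "$f" ] || continue            # empty or missing dir: nothing to list
    case "$f" in
      *.sh)     echo "shell script: $f" ;;
      *.sql)    echo "psql < $f" ;;
      *.sql.gz) echo "gunzip -c $f | psql" ;;
      *)        echo "ignored: $f" ;;
    esac
  done
}

# Directory name taken from the example above; override with the first argument.
preview_initdb "${1:-/tmp/my_scripts}"
```

Since files are handled in name order, numeric prefixes such as 01-, 02- are a simple way to control sequencing.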

There is yet another option available that utilises Flocker:

Flocker is a container data volume manager that is designed to allow databases like PostgreSQL to easily run in containers in production. When running a database in production, you have to think about things like recovering from host failure. Flocker provides tools for managing data volumes across a cluster of machines like you have in a production environment. For example, as a Postgres container is scheduled between hosts in response to server failure, Flocker can automatically move its associated data volume between hosts at the same time. This means that when your Postgres container starts up on a new host, it has its data. This operation can be accomplished manually using the Flocker API or CLI, or automatically by a container orchestration tool that Flocker integrates with, for example Docker Swarm, Kubernetes or Mesos.

I followed the same solution as @damoiser. The only difference was that I wanted to import all the dump data.

Please follow the solution below. (I have not done any kind of checks.)

Dockerfile:

FROM postgres:9.5

RUN mkdir -p /tmp/psql_data/

COPY db/structure.sql /tmp/psql_data/
COPY scripts/init_docker_postgres.sh /docker-entrypoint-initdb.d/

Then the docker-entrypoint-initdb.d script:

#!/bin/bash

DB_DUMP_LOCATION="/tmp/psql_data/structure.sql"

echo "*** CREATING DATABASE ***"

psql -U postgres < "$DB_DUMP_LOCATION";

echo "*** DATABASE CREATED! ***"

and then you can build your image as:

docker build -t abhije***/postgres-data .

docker run -d abhije***/postgres-data 

My solution is inspired by Alex Dguez's answer, which unfortunately doesn't work for me because:

  1. I used the pg-9.6 base image, and RUN /docker-entrypoint.sh --help never ran through for me; it always complained with The command '/bin/sh -c /docker-entrypoint.sh -' returned a non-zero code: 1
  2. I don't want to pollute the /docker-entrypoint-initdb.d dir

The following answer is originally from my reply in another post: https://stackoverflow.com/a/59303962/4440427. Note that the solution restores from a binary dump instead of from plain SQL as asked by the OP, but it can be modified slightly to adapt to the plain SQL case.

Dockerfile:

FROM postgres:9.6.16-alpine

LABEL maintainer="lu@cobrainer.com"
LABEL org="Cobrainer GmbH"

ARG PG_POSTGRES_PWD=postgres
ARG DBUSER=someuser
ARG DBUSER_PWD=P@ssw0rd
ARG DBNAME=sampledb
ARG DB_DUMP_FILE=example.pg

ENV POSTGRES_DB launchpad
ENV POSTGRES_USER postgres
ENV POSTGRES_PASSWORD ${PG_POSTGRES_PWD}
ENV PGDATA /pgdata

COPY wait-for-pg-isready.sh /tmp/wait-for-pg-isready.sh
COPY ${DB_DUMP_FILE} /tmp/pgdump.pg

RUN set -e && \
    nohup bash -c "docker-entrypoint.sh postgres &" && \
    /tmp/wait-for-pg-isready.sh && \
    psql -U postgres -c "CREATE USER ${DBUSER} WITH SUPERUSER CREATEDB CREATEROLE ENCRYPTED PASSWORD '${DBUSER_PWD}';" && \
    psql -U ${DBUSER} -d ${POSTGRES_DB} -c "CREATE DATABASE ${DBNAME} TEMPLATE template0;" && \
    pg_restore -v --no-owner --role=${DBUSER} --exit-on-error -U ${DBUSER} -d ${DBNAME} /tmp/pgdump.pg && \
    psql -U postgres -c "ALTER USER ${DBUSER} WITH NOSUPERUSER;" && \
    rm -rf /tmp/pgdump.pg

HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
  CMD pg_isready -U postgres -d launchpad

where wait-for-pg-isready.sh is:

#!/bin/bash
set -e

get_non_lo_ip() {
  local _ip _non_lo_ip _line _nl=$'\n'
  while IFS=$': \t' read -a _line ;do
    [ -z "${_line%inet}" ] &&
        _ip=${_line[${#_line[1]}>4?1:2]} &&
        [ "${_ip#127.0.0.1}" ] && _non_lo_ip=$_ip
    done< <(LANG=C /sbin/ifconfig)
  printf ${1+-v} $1 "%s${_nl:0:$[${#1}>0?0:1]}" $_non_lo_ip
}

get_non_lo_ip NON_LO_IP
until pg_isready -h $NON_LO_IP -U "postgres" -d "launchpad"; do
  >&2 echo "Postgres is not ready - sleeping..."
  sleep 4
done

>&2 echo "Postgres is up - you can execute commands now"

The above scripts, together with a more detailed README, are available at https://github.com/cobrainer/pg-docker-with-restored-db
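The until loop in wait-for-pg-isready.sh retries forever, so a server that never comes up would leave docker build hanging. A bounded variant is sketched below; wait_until, its argument order and the numbers in the usage comment are illustrative and not part of the linked repository.

```shell
#!/bin/bash
# wait_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds, at most MAX_TRIES times,
# sleeping DELAY_SECONDS between attempts.
# Returns 0 on the first success, 1 if every attempt failed.
wait_until() {
  local tries="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= tries; i++)); do
    if "$@"; then
      return 0
    fi
    >&2 echo "not ready (attempt $i/$tries)"
    # No point sleeping after the final attempt.
    if ((i < tries)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Possible use in place of the until loop (about one minute in total):
#   wait_until 15 4 pg_isready -h "$NON_LO_IP" -U postgres -d launchpad
```

Combined with set -e in the calling script, a non-zero return makes the RUN step fail instead of blocking indefinitely.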

I was able to load the data in by prepending /etc/init.d/postgresql start to the RUN command in the Dockerfile. My Dockerfile has the following line, which is working for me:

RUN /etc/init.d/postgresql start && /usr/bin/psql -a < /tmp/dump.sql

For E2E tests, where we need a database whose structure and data are already saved in the Docker image, we did the following:

Dockerfile:

FROM postgres:9.4.24-alpine
ENV POSTGRES_USER postgres
ENV POSTGRES_PASSWORD postgres
ENV PGDATA /pgdata
COPY database.backup /tmp/
COPY database_restore.sh /docker-entrypoint-initdb.d/
RUN /docker-entrypoint.sh --help
RUN rm -rf /docker-entrypoint-initdb.d/database_restore.sh
RUN rm -rf /tmp/database.backup

database_restore.sh:

#!/bin/sh
set -e 
pg_restore -C -d postgres /tmp/database.backup

To create the image:

docker build .

To start the container:

docker run --name docker-postgres -d -p 5432:5432 <Id-docker-image>

This does not restore the database every time the container is booted. The structure and data of the database are already contained in the created Docker image.

We based this on the following article, but without the multi-stage build: Creating Fast, Lightweight Testing Databases in Docker

Edit: This no longer works with version 9.4-alpine because it does not run the database_restore.sh script. Use version 9.4.24-alpine.
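The restore is not repeated because the entrypoint only runs the initialization phase when the data directory has not been set up yet, which it recognises by the PG_VERSION file inside PGDATA. Moving PGDATA to /pgdata matters for the same trick: the official image declares its default data directory as a volume, and files written there during docker build are not kept in the image. The check can be sketched roughly as follows; needs_init is a made-up name and this is a simplification, not the entrypoint's own code.

```shell
#!/bin/bash
# needs_init [PGDATA_DIR]
# Hypothetical helper mirroring the idea of the entrypoint's check:
# succeeds (exit 0) when the directory has no non-empty PG_VERSION file,
# i.e. when first-time initialization would still have to run.
needs_init() {
  local pgdata="${1:-${PGDATA:-/var/lib/postgresql/data}}"
  [ ! -s "$pgdata/PG_VERSION" ]
}

if needs_init; then
  echo "data directory not initialized: init scripts would run"
else
  echo "data directory already initialized: init scripts are skipped"
fi
```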

My goal was to have an image that contains the database, i.e. saving the time to rebuild it every time I do docker run or docker-compose up.

We would just have to manage to get the line exec "$@" out of docker-entrypoint.sh. So I added the following to my Dockerfile:

# Copy my SQL scripts into the image to /docker-entrypoint-initdb.d:
COPY ./init_db /docker-entrypoint-initdb.d

#init db
RUN grep -v 'exec "$@"' /usr/local/bin/docker-entrypoint.sh > /tmp/docker-entrypoint-without-serverstart.sh && \
    chmod a+x /tmp/docker-entrypoint-without-serverstart.sh && \
    /tmp/docker-entrypoint-without-serverstart.sh postgres && \
    rm -rf /docker-entrypoint-initdb.d/* /tmp/docker-entrypoint-without-serverstart.sh
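What the grep -v step does can be seen on a toy script: every line containing exec "$@" is dropped, so the script still performs its setup but no longer hands control to the server at the end. The file content below is invented for the demonstration, and strip_exec is a made-up name; the real entrypoint differs between image versions, so it is worth inspecting the filtered result before relying on it.

```shell
#!/bin/bash
# strip_exec FILE
# Hypothetical helper: prints FILE without any line containing: exec "$@"
# (the same filter as the grep -v in the Dockerfile above).
strip_exec() {
  grep -v 'exec "$@"' "$1"
}

# Toy stand-in for an entrypoint: one setup line, then the hand-off.
demo="$(mktemp)"
printf '%s\n' '#!/bin/sh' 'echo "init phase"' 'exec "$@"' > "$demo"

strip_exec "$demo"
# prints:
#   #!/bin/sh
#   echo "init phase"

rm -f "$demo"
```

Running grep -c 'exec "$@"' against the real script first shows how many lines would be removed.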
