
What postgres privileges are required for Apache Airflow database backend?

The documentation on "initializing the backend" does not mention this.

If I create an airflow_user role and an airflow schema, is it sufficient to grant airflow_user USAGE on the airflow schema and then SELECT, UPDATE, INSERT and DELETE on all tables? Or does that user need GRANT ALL on all tables in the airflow schema?

Is it better to make airflow_user the owner of the airflow schema, which, as I understand it, would give it all privileges inside that schema?

References:
Postgres GRANT ( link )
Postgres privileges ( link )
Unanswered SO question ( link )

Creating a schema is not sufficient. The schema is not a configurable part of the DSN set in AIRFLOW__CORE__SQL_ALCHEMY_CONN.

libpq, which the psycopg2 driver depends on, does not accept search_path as a connection parameter of its own in the DSN.
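To make the point about the DSN concrete, here is a minimal sketch that assembles the connection string from its parts. The AIRFLOW_DB_* variable names are hypothetical and the values are placeholders; only AIRFLOW__CORE__SQL_ALCHEMY_CONN is the name Airflow reads.

```shell
# Hypothetical helper variables (not read by Airflow); placeholder values.
AIRFLOW_DB_USER=airflow
AIRFLOW_DB_PASSWORD=airflow
AIRFLOW_DB_HOST=172.17.0.1
AIRFLOW_DB_PORT=5432
AIRFLOW_DB_NAME=airflow

# The URL carries user, password, host, port and database name.
# None of these positions names a schema.
AIRFLOW__CORE__SQL_ALCHEMY_CONN="postgresql://${AIRFLOW_DB_USER}:${AIRFLOW_DB_PASSWORD}@${AIRFLOW_DB_HOST}:${AIRFLOW_DB_PORT}/${AIRFLOW_DB_NAME}"
export AIRFLOW__CORE__SQL_ALCHEMY_CONN

echo "$AIRFLOW__CORE__SQL_ALCHEMY_CONN"
```

The last path segment selects the database, which is why a dedicated database is the simplest unit of isolation here.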

The best practice is to create a database for Airflow.

For example,

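Save the following as create-airflow-db.sql:

-- Create the role first so it can own the database.
create user airflow with password 'airflow';
-- Owning the database also lets the role create tables in the public
-- schema on PostgreSQL 15 and later, where that is no longer open to all roles.
create database airflow owner airflow;
grant all on database airflow to airflow;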

Start the database server in a container. Keep this running in a separate terminal.

docker run -it --rm \
--publish '5432:5432' \
-v "$PWD/create-airflow-db.sql:/create-airflow-db.sql" \
--name postgres \
-e POSTGRES_PASSWORD=password postgres:alpine

Run the statements in the SQL file from a different terminal.

docker exec -ti postgres psql -w -U postgres -d postgres -f /create-airflow-db.sql
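If the SQL file is run immediately after the container starts, the server may not be accepting connections yet. A small retry helper is one way to guard against that. This is only a sketch: wait_for is a hypothetical function name, and the commented usage assumes the container is named postgres as above and relies on the pg_isready utility that ships with PostgreSQL.

```shell
# wait_for ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it succeeds, sleeping one second between tries.
# Returns 1 if it still fails after ATTEMPTS tries.
wait_for() {
  attempts=$1
  shift
  i=0
  until "$@"; do
    i=$((i + 1))
    if [ "$i" -ge "$attempts" ]; then
      return 1
    fi
    sleep 1
  done
  return 0
}

# Usage (assumes the container from the previous step is running):
#   wait_for 30 docker exec postgres pg_isready -U postgres
```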

Finally, run the Airflow service in a new terminal.

docker run --rm \
-it \
-e 'AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql://airflow:airflow@172.17.0.1:5432/airflow' \
--publish '8080:8080' puckel/docker-airflow

You can connect to the database server and list the tables.

$ docker exec -ti postgres psql -w -U airflow -d airflow
psql (13.1)
Type "help" for help.

airflow=> \dt
                List of relations
 Schema |         Name          | Type  |  Owner
--------+-----------------------+-------+---------
 public | alembic_version       | table | airflow
 public | chart                 | table | airflow
 public | connection            | table | airflow
 public | dag                   | table | airflow
 public | dag_pickle            | table | airflow
 public | dag_run               | table | airflow
 public | dag_tag               | table | airflow
 public | import_error          | table | airflow
 public | job                   | table | airflow
 public | known_event           | table | airflow
 public | known_event_type      | table | airflow
 public | kube_resource_version | table | airflow
 public | kube_worker_uuid      | table | airflow
 public | log                   | table | airflow
 public | serialized_dag        | table | airflow
 public | sla_miss              | table | airflow
 public | slot_pool             | table | airflow
 public | task_fail             | table | airflow
 public | task_instance         | table | airflow
 public | task_reschedule       | table | airflow
 public | users                 | table | airflow
 public | variable              | table | airflow
 public | xcom                  | table | airflow
(23 rows)

Here you can see that the tables backing the Airflow models are created in the public schema, which is the default on the search path, and that they are owned by the airflow user.
