How do I load a CSV into AWS RDS using an Airflow Postgres hook?
I'm trying to use the copy_expert hook here: https://airflow.apache.org/docs/stable/_modules/airflow/hooks/postgres_hook.html but I don't understand the syntax and I don't have an example to follow. My goal is to load a CSV into an AWS RDS instance running Postgres.
```python
hook_copy_expert = airflow.hooks.postgres_hook.PostgresHook('postgres_amazon')

def import_to_postgres():
    sql = f"DELETE FROM amazon.amazon_purchases; COPY amazon.amazon_purchases FROM '{path}' DELIMITER ',' CSV HEADER;"
    hook_copy_expert(sql, path, open=open)

t4 = PythonOperator(
    task_id = 'import_to_postgres',
    python_callable = import_to_postgres,
    dag = dag,
)
```
When I run this, I get an error saying `name 'sql' is not defined`. Can someone help me understand what I'm doing wrong?
Edit: I got the hook to run, but now I get this error:
ERROR - must be superuser or a member of the pg_read_server_files role to COPY from a file
HINT: Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone.
I thought the whole point of using the Postgres hook was to be able to use the COPY command in SQL without being a superuser? What am I doing wrong?
You can't run COPY from a file on RDS, and you can't run psql's \copy from the PostgreSQL operator either.
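(For what it's worth, the HINT in that error refers to COPY ... FROM STDIN, which streams the file from the client side rather than reading it on the database server; that is what the hook's copy_expert method hands to psycopg2. A minimal sketch of the call shape, assuming the Airflow 1.10-era import path from the question and a placeholder file path:)

```python
from airflow.hooks.postgres_hook import PostgresHook

def copy_csv_from_stdin():
    # Placeholder path: the Airflow worker reads this file, not the RDS server.
    path = '/path/to/amazon_purchases.csv'

    hook = PostgresHook(postgres_conn_id='postgres_amazon')

    # copy_expert is a method on the hook, called with the SQL and a local
    # filename; FROM STDIN streams the file over the connection, so it does
    # not require superuser or the pg_read_server_files role.
    sql = "COPY amazon.amazon_purchases FROM STDIN DELIMITER ',' CSV HEADER;"
    hook.copy_expert(sql, path)
```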
Unless it's an enormous file, try loading the CSV data into memory with the Python csv module, and then inserting it into the DB.
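A minimal sketch of that in-memory approach, assuming the CSV has a header row, the table and connection id from the question, and a placeholder file path (insert_rows is the generic DbApiHook helper, not something specific to this question):

```python
import csv

from airflow.hooks.postgres_hook import PostgresHook

def import_to_postgres():
    # Placeholder path to a CSV the Airflow worker can read locally.
    path = '/path/to/amazon_purchases.csv'

    hook = PostgresHook(postgres_conn_id='postgres_amazon')

    # Clear out the old data first, mirroring the DELETE in the question.
    hook.run("DELETE FROM amazon.amazon_purchases;")

    # Read the whole CSV into memory, skipping the header row.
    with open(path, newline='') as f:
        reader = csv.reader(f)
        next(reader)  # skip header
        rows = list(reader)

    # insert_rows issues plain INSERT statements over the Airflow connection,
    # so no superuser role or server-side file access is needed. Values are
    # inserted as strings; cast them here if the table has non-text columns.
    hook.insert_rows(table='amazon.amazon_purchases', rows=rows)
```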