How do I load a CSV into AWS RDS using an Airflow Postgres hook?
I'm trying to use the copy_expert hook here: https://airflow.apache.org/docs/stable/_modules/airflow/hooks/postgres_hook.html but I don't understand the syntax and I don't have an example to follow. My goal is to load a CSV into an AWS RDS instance running Postgres.
hook_copy_expert = airflow.hooks.postgres_hook.PostgresHook('postgres_amazon')

def import_to_postgres():
    sql = f"DELETE FROM amazon.amazon_purchases; COPY amazon.amazon_purchases FROM '{path}' DELIMITER ',' CSV HEADER;"
    hook_copy_expert(sql, path, open=open)

t4 = PythonOperator(
    task_id = 'import_to_postgres',
    python_callable = import_to_postgres,
    dag = dag,
)
When I run this, I get an error saying name 'sql' is not defined. Can someone help me understand what I'm doing wrong?
Edit: I got the hook to run, but now I get a different error:
ERROR - must be superuser or a member of the pg_read_server_files role to COPY from a file
HINT: Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone.
I thought the whole point of using the Postgres hook was to use the COPY command in SQL without having superuser status. What am I doing wrong?
You can't run COPY on RDS, and you can't run psql's \copy from a PostgreSQL operator.
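By contrast, the hook's copy_expert method streams a local file to the server over STDIN, which the HINT in the error says anyone may do. A minimal sketch (the helper names and CSV path are my own, and it assumes the postgres_amazon connection from the question):

```python
def build_copy_sql(table):
    # The server reads nothing from its own disk; the data arrives on STDIN,
    # so no superuser or pg_read_server_files membership is required.
    return f"COPY {table} FROM STDIN WITH (FORMAT csv, HEADER true)"

def copy_csv_via_stdin(csv_path, table, conn_id="postgres_amazon"):
    # Deferred import so this module loads even where Airflow isn't installed.
    from airflow.hooks.postgres_hook import PostgresHook
    hook = PostgresHook(postgres_conn_id=conn_id)
    # copy_expert opens csv_path on the Airflow worker and feeds it to COPY.
    hook.copy_expert(build_copy_sql(table), filename=csv_path)
```

Called as copy_csv_via_stdin('/path/to/file.csv', 'amazon.amazon_purchases') from a PythonOperator, this avoids the server-side file read that RDS forbids.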
Unless it's an enormous file, try loading the CSV data into memory with the Python csv module and then inserting it into the DB.
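A sketch of that approach (the function names and path are illustrative; it reuses the postgres_amazon connection and amazon.amazon_purchases table from the question):

```python
import csv

def load_csv_rows(csv_path):
    """Read the CSV into memory as a list of tuples, skipping the header row."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # drop the header row
        return [tuple(row) for row in reader]

def import_to_postgres(csv_path):
    # Deferred import so the module loads even where Airflow isn't installed.
    from airflow.hooks.postgres_hook import PostgresHook
    hook = PostgresHook(postgres_conn_id="postgres_amazon")
    hook.run("DELETE FROM amazon.amazon_purchases;")
    # insert_rows issues plain batched INSERTs, which need no special role.
    hook.insert_rows(table="amazon.amazon_purchases", rows=load_csv_rows(csv_path))
```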