How do I load a CSV into AWS RDS using an Airflow Postgres hook?
I'm trying to use the copy_expert hook here: https://airflow.apache.org/docs/stable/_modules/airflow/hooks/postgres_hook.html but I don't understand the syntax and don't have an example to follow. My goal is to load a CSV into an AWS RDS instance running Postgres.
hook_copy_expert = airflow.hooks.postgres_hook.PostgresHook('postgres_amazon')

def import_to_postgres():
    sql = f"DELETE FROM amazon.amazon_purchases; COPY amazon.amazon_purchases FROM '{path}' DELIMITER ',' CSV HEADER;"
    hook_copy_expert(sql, path, open=open)

t4 = PythonOperator(
    task_id = 'import_to_postgres',
    python_callable = import_to_postgres,
    dag = dag,
)
When I run this, I get an error saying name 'sql' is not defined. Can someone help me understand what I'm doing wrong?
Edit: I got the hook to run, but now I get this error:
ERROR - must be superuser or a member of the pg_read_server_files role to COPY from a file
HINT: Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone.
I thought the whole point of using the Postgres hook was to be able to use the COPY command in SQL without being a superuser? What am I doing wrong?
You can't run COPY on RDS, and you can't run psql's \COPY from a PostgreSQL operator either.
Unless it's an enormous file, try loading the CSV data into memory with the Python csv module and then inserting it into the DB.
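A minimal sketch of that approach, assuming the same 'postgres_amazon' connection id from the question and a hypothetical local CSV path; it reads the file with the csv module and bulk-inserts the rows through the hook's insert_rows method instead of a server-side COPY:

import csv

from airflow.hooks.postgres_hook import PostgresHook

path = '/tmp/amazon_purchases.csv'  # hypothetical path, stands in for the question's path variable

def import_to_postgres():
    hook = PostgresHook(postgres_conn_id='postgres_amazon')

    # Clear out the old rows first, as in the original SQL
    hook.run("DELETE FROM amazon.amazon_purchases;")

    # Read the CSV into memory, skipping the header row
    with open(path) as f:
        reader = csv.reader(f)
        next(reader)
        rows = list(reader)

    # insert_rows (inherited from DbApiHook) issues plain INSERTs from the
    # client side, so no superuser or server-side file access is required
    hook.insert_rows(table='amazon.amazon_purchases', rows=rows)

By default insert_rows commits every 1000 rows (the commit_every argument), which is usually fine for a moderately sized CSV.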