
How do I load a CSV into AWS RDS using an Airflow Postgres hook?

I'm trying to use the copy_expert hook here: https://airflow.apache.org/docs/stable/_modules/airflow/hooks/postgres_hook.html but I don't understand the syntax and I don't have an example to follow. My goal is to load a CSV into an AWS RDS instance running Postgres.

hook_copy_expert = airflow.hooks.postgres_hook.PostgresHook('postgres_amazon')

def import_to_postgres():
sql = f"DELETE FROM amazon.amazon_purchases; COPY amazon.amazon_purchases FROM '{path}' DELIMITER ',' CSV HEADER;"
        hook_copy_expert(sql, path, open=open)

t4 = PythonOperator(
    task_id = 'import_to_postgres',
    python_callable = import_to_postgres,
    dag = dag,
    )

When I run this, I get an error saying name 'sql' is not defined. Can someone help me understand what I'm doing wrong?

Edit: I got the hook to run, but now I'm getting this error:

ERROR - must be superuser or a member of the pg_read_server_files role to COPY from a file
HINT:  Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone.

I thought the whole point of using the Postgres hook was to be able to use the COPY command in SQL without being a superuser? What am I doing wrong?

You can't run COPY on RDS, and you can't run psql's \COPY from the PostgreSQL operator either.

Unless it's an enormous file, try loading the CSV data into memory with the Python csv module and then inserting it into the DB.
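A minimal sketch of that approach, reusing the connection id, table, and path variable from the question (it assumes path points to a CSV whose columns are in the same order as the table's columns):

import csv

from airflow.hooks.postgres_hook import PostgresHook

def import_to_postgres():
    # Same connection id as in the question.
    hook = PostgresHook(postgres_conn_id='postgres_amazon')

    # Clear the existing rows first, as the original DELETE intended.
    hook.run("DELETE FROM amazon.amazon_purchases;")

    # Read the whole CSV into memory; `path` is assumed to be defined
    # elsewhere, as in the question.
    with open(path) as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        rows = list(reader)

    # insert_rows issues batched INSERT statements via the hook.
    hook.insert_rows(table='amazon.amazon_purchases', rows=rows)

Because insert_rows sends plain INSERTs over the connection, it doesn't need the server-side file access that COPY FROM '<file>' requires, so it works with an ordinary RDS user.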

