
S3 file to MySQL on AWS via Airflow

I've been learning how to use Apache Airflow over the last couple of months and wanted to see if anybody has experience transferring CSV files from S3 to a MySQL database on AWS (RDS), or from my local drive to MySQL.

I managed to send everything to an S3 bucket to store it in the cloud using airflow.hooks.S3_hook, and it works great. I used boto3 to do this. Now I want to push this file to a MySQL database I created in RDS, but I have no idea how to do it. Do I need to use the MySQL hook, add my credentials there, and then write a Python function?
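For context, the upload step I have working looks roughly like this (the connection ID, bucket name, and paths are simplified placeholders):

from airflow.hooks.S3_hook import S3Hook  # Airflow 1.10 import path

def upload_to_s3():
    # Uses an 'aws_conn' connection configured in the Airflow UI
    hook = S3Hook(aws_conn_id='aws_conn')
    hook.load_file(
        filename='/path/to/myfile.csv',  # local CSV to upload
        key='myfile.csv',                # object key in the bucket
        bucket_name='my-bucket',
        replace=True,                    # overwrite if the key already exists
    )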

Also, it doesn't have to be S3 to MySQL; I can also try going from my local drive to MySQL if that's easier.

Any help would be amazing!

Airflow has the S3ToMySqlOperator, which can be imported via:

from airflow.providers.mysql.transfers.s3_to_mysql import S3ToMySqlOperator

Note that you will need to install the MySQL provider.

For the Airflow 1.10 series (backport version):

pip install apache-airflow-backport-providers-mysql

For Airflow >= 2.0 (regular version, currently in beta):

pip install apache-airflow-providers-mysql

Example usage:

S3ToMySqlOperator(
    s3_source_key='myfile.csv',
    mysql_table='myfile_table',
    mysql_duplicate_key_handling='IGNORE',
    mysql_extra_options="""
            FIELDS TERMINATED BY ','
            IGNORE 1 LINES
            """,
    task_id='transfer_task',
    aws_conn_id='aws_conn',
    mysql_conn_id='mysql_conn',
    dag=dag
)
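
For context, a minimal sketch of a complete DAG around this operator could look like the following (the DAG ID, schedule, and file/table names are illustrative placeholders; both connections must already exist in Airflow):

from datetime import datetime

from airflow import DAG
from airflow.providers.mysql.transfers.s3_to_mysql import S3ToMySqlOperator

with DAG(
    dag_id='s3_csv_to_mysql',      # hypothetical DAG name
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,        # trigger manually
    catchup=False,
) as dag:
    S3ToMySqlOperator(
        task_id='transfer_task',
        s3_source_key='myfile.csv',             # key of the CSV in S3
        mysql_table='myfile_table',             # target table must already exist
        mysql_duplicate_key_handling='IGNORE',  # skip rows that hit duplicate keys
        mysql_extra_options="""
            FIELDS TERMINATED BY ','
            IGNORE 1 LINES
        """,
        aws_conn_id='aws_conn',
        mysql_conn_id='mysql_conn',
    )

Note that this operator loads the file with LOAD DATA LOCAL INFILE under the hood, so the MySQL connection has to allow local infile loads. If you hit error 2068 (see the comment below), one likely fix, assuming Airflow's MySqlHook reads this flag from the connection extras, is to add {"local_infile": true} to the Extra field of the mysql_conn connection; on RDS the server-side local_infile parameter may also need to be enabled in the DB parameter group.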

Were you able to resolve the MySQLdb._exceptions.OperationalError: (2068, "LOAD DATA LOCAL INFILE file request rejected due to restrictions on access") issue?
