
How can I unload data from Amazon Aurora MySQL to Amazon S3 like Amazon Redshift?

I have an Amazon Elastic MapReduce (EMR) job that I would like to use to process data unloaded from an Amazon Aurora MySQL table, much the same way I do from Amazon Redshift. That is, run a query such as:

unload ('select * from whatever where week = \'2011/11/21\'') to 's3://somebucket' credentials 'blah'

Then, the EMR job processes lines from the dumped data and writes back to S3.

Is this possible? How?

This feature now appears to be supported. The command is called SELECT INTO OUTFILE S3.

After this answer was originally written (the answer at that time was "no"), Aurora added this capability.

You can now use the SELECT INTO OUTFILE S3 SQL statement to query data from an Amazon Aurora database cluster and save it directly into text files in an Amazon S3 bucket. This means you no longer need the two-step process of bringing the data to the SQL client and then copying it from the client to Amazon S3. It's an easy way to export data selectively to Amazon Redshift or any other application.

https://aws.amazon.com/about-aws/whats-new/2017/06/amazon-aurora-can-export-data-into-amazon-s3/
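As a rough equivalent of the UNLOAD statement in the question, a query along these lines should work (a minimal sketch; the table, column, and bucket names are taken from the question, and the s3-us-west-2 region prefix is only an example):

SELECT * FROM whatever
WHERE week = '2011/11/21'
INTO OUTFILE S3 's3-us-west-2://somebucket/week-2011-11-21'
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

Note that, unlike Redshift's UNLOAD, there is no credentials clause in the statement itself; Aurora gets write access to the bucket through an IAM role associated with the cluster.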

Original answer (now outdated): Aurora for MySQL doesn't support this.


As you know, on conventional servers, MySQL has two complementary capabilities, LOAD DATA INFILE and SELECT INTO OUTFILE , which work with local (to the server) files. In late 2016, Aurora announced an S3 analog to LOAD DATA INFILE -- LOAD DATA FROM S3 -- but there is not, at least as of yet, the opposite capability.
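For comparison, the inbound statement looks like this (a sketch; the bucket and file names are made up):

LOAD DATA FROM S3 's3-us-west-2://somebucket/data/employees.csv'
INTO TABLE employees
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';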

You can use the SELECT INTO OUTFILE S3 statement to query data from an Amazon Aurora MySQL DB cluster and save it directly into text files stored in an Amazon S3 bucket. This feature was added back in 2017.

Example:

-- Export the employees table as comma-delimited text files under the given S3 prefix
SELECT * FROM employees INTO OUTFILE S3 's3-us-west-2://aurora-select-into-s3-pdx/sample_employee_data'
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
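Before this runs, the cluster itself needs permission to write to the bucket: you attach an IAM role that can write to the target bucket to the Aurora cluster and reference it in the cluster parameter group (the documentation describes the aurora_select_into_s3_role and aws_default_s3_role parameters for this). The database user also needs the corresponding privilege; on Aurora MySQL 5.6/5.7-compatible versions that is granted roughly like this (the user name is made up):

GRANT SELECT INTO S3 ON *.* TO 'etl_user'@'%';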

And here is the full syntax with all supported options:

SELECT
    [ALL | DISTINCT | DISTINCTROW ]
        [HIGH_PRIORITY]
        [STRAIGHT_JOIN]
        [SQL_SMALL_RESULT] [SQL_BIG_RESULT] [SQL_BUFFER_RESULT]
        [SQL_CACHE | SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS]
    select_expr [, select_expr ...]
    [FROM table_references
        [PARTITION partition_list]
    [WHERE where_condition]
    [GROUP BY {col_name | expr | position}
        [ASC | DESC], ... [WITH ROLLUP]]
    [HAVING where_condition]
    [ORDER BY {col_name | expr | position}
         [ASC | DESC], ...]
    [LIMIT {[offset,] row_count | row_count OFFSET offset}]
    [PROCEDURE procedure_name(argument_list)]
INTO OUTFILE S3 's3_uri'
[CHARACTER SET charset_name]
    [export_options]
    [MANIFEST {ON | OFF}]
    [OVERWRITE {ON | OFF}]

export_options:
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]

You can find this in the AWS Documentation here: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/AuroraMySQL.Integrating.SaveIntoS3.html
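If a downstream job (such as the EMR job in the question) should read a single entry point instead of globbing the numbered file parts, the MANIFEST option from the syntax above can help. A sketch reusing the sample bucket from the earlier example: MANIFEST ON writes a JSON manifest listing the data files that were created, and OVERWRITE ON replaces files left over from a previous run.

SELECT * FROM employees
INTO OUTFILE S3 's3-us-west-2://aurora-select-into-s3-pdx/sample_employee_data'
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
MANIFEST ON
OVERWRITE ON;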
