I have an Amazon Elastic MapReduce (EMR) job that I would like to use to process data unloaded from an Amazon Aurora MySQL table, much the same way I do from Amazon Redshift. That is, run a query such as:
unload ('select * from whatever where week = \'2011/11/21\'') to 's3://somebucket' credentials 'blah'
Then, the EMR job processes lines from the dumped data and writes back to S3.
Is this possible? How?
This feature now appears to be supported. The command is called SELECT INTO OUTFILE S3.
After this answer was originally written (the answer at that time was "no"), Aurora added this capability.

You can now use the SELECT INTO OUTFILE S3 SQL statement to query data from an Amazon Aurora database cluster and save it directly into text files in an Amazon S3 bucket. This means you no longer need the two-step process of bringing the data to the SQL client and then copying it from the client to Amazon S3. It's an easy way to export data selectively to Amazon Redshift or any other application.

https://aws.amazon.com/about-aws/whats-new/2017/06/amazon-aurora-can-export-data-into-amazon-s3/
Aurora for MySQL doesn't support this.
As you know, on conventional servers, MySQL has two complementary capabilities, LOAD DATA INFILE and SELECT INTO OUTFILE, which work with files local to the server.
In late 2016, Aurora announced an S3 analog to LOAD DATA INFILE -- LOAD DATA FROM S3 -- but there is not, at least as of yet, the opposite capability.
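For reference, the import side mentioned above looks like the following sketch. The S3 URI and table name are placeholders, and the exact options depend on how the file was exported:

```sql
-- Hypothetical example: load a CSV object from S3 into an Aurora MySQL table.
-- Requires an IAM role that grants the cluster read access to the bucket.
LOAD DATA FROM S3 's3-us-west-2://my-bucket/exports/employees.part_00000'
INTO TABLE employees
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
```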
You can use the SELECT INTO OUTFILE S3 statement to query data from an Amazon Aurora MySQL DB cluster and save it directly into text files stored in an Amazon S3 bucket. This feature was added back in 2017.
Example:
SELECT * FROM employees INTO OUTFILE S3 's3-us-west-2://aurora-select-into-s3-pdx/sample_employee_data'
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
And here are all of the options that are supported:
SELECT
[ALL | DISTINCT | DISTINCTROW ]
[HIGH_PRIORITY]
[STRAIGHT_JOIN]
[SQL_SMALL_RESULT] [SQL_BIG_RESULT] [SQL_BUFFER_RESULT]
[SQL_CACHE | SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS]
select_expr [, select_expr ...]
[FROM table_references
[PARTITION partition_list]
[WHERE where_condition]
[GROUP BY {col_name | expr | position}
[ASC | DESC], ... [WITH ROLLUP]]
[HAVING where_condition]
[ORDER BY {col_name | expr | position}
[ASC | DESC], ...]
[LIMIT {[offset,] row_count | row_count OFFSET offset}]
[PROCEDURE procedure_name(argument_list)]
INTO OUTFILE S3 's3_uri'
[CHARACTER SET charset_name]
[export_options]
[MANIFEST {ON | OFF}]
[OVERWRITE {ON | OFF}]
export_options:
[{FIELDS | COLUMNS}
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
[LINES
[STARTING BY 'string']
[TERMINATED BY 'string']
]
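For example, if an EMR job needs to discover the exported files, MANIFEST ON writes a JSON manifest listing them alongside the data. A sketch, with the bucket and prefix as placeholders:

```sql
-- Hypothetical example: export with a manifest file and overwrite any
-- existing objects at the same S3 prefix on re-runs.
SELECT * FROM employees
INTO OUTFILE S3 's3-us-west-2://my-bucket/exports/employees'
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
MANIFEST ON
OVERWRITE ON;
```

Per the grammar above, MANIFEST and OVERWRITE come after the FIELDS/LINES export options.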
You can find this in the AWS Documentation here: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/AuroraMySQL.Integrating.SaveIntoS3.html