I am trying to ETL data from a Redshift instance (in a VPC) to an S3 bucket using AWS Glue. For this I created a JDBC connection to Redshift.
The crawler successfully fetches schema information from Redshift into the Data Catalog. But when I run the ETL job, it fails to fetch data and says "resource unavailable".
Redshift is inside your VPC. Glue is inside your VPC. S3 isn't. By default, accessing S3 requires access to the Internet in most cases.
To access data in S3, you need either a NAT Gateway, a NAT instance, or an S3 VPC endpoint, which provides a termination point for S3 traffic inside the VPC.
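As a rough illustration of the S3 VPC endpoint option, here is a minimal sketch using boto3's `create_vpc_endpoint` call on the EC2 client. The helper names, the VPC ID, the route table ID, and the region are all hypothetical placeholders, not values from the question. It assumes a gateway-type endpoint and the usual `com.amazonaws.<region>.s3` service name. The route tables you pass should be the ones associated with the subnet your Glue connection uses.

```python
def s3_gateway_endpoint_params(vpc_id, region, route_table_ids):
    """Build the request parameters for an S3 gateway VPC endpoint.

    All arguments are placeholders supplied by the caller (hypothetical
    helper, not part of boto3).
    """
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        # Assumed service name format for S3 in a given region.
        "ServiceName": f"com.amazonaws.{region}.s3",
        # Route tables that should receive a route to S3 via the endpoint.
        "RouteTableIds": list(route_table_ids),
    }


def create_s3_gateway_endpoint(vpc_id, region, route_table_ids):
    """Create the endpoint and return its ID (requires AWS credentials)."""
    import boto3  # imported here so the parameter helper works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    params = s3_gateway_endpoint_params(vpc_id, region, route_table_ids)
    response = ec2.create_vpc_endpoint(**params)
    return response["VpcEndpoint"]["VpcEndpointId"]


if __name__ == "__main__":
    # Placeholder IDs: replace with your own VPC and route table.
    print(s3_gateway_endpoint_params(
        "vpc-0123456789abcdef0", "us-east-1", ["rtb-0123456789abcdef0"]
    ))
```

Calling `create_s3_gateway_endpoint(...)` with real IDs would actually create the endpoint; the `__main__` block only prints the parameters, so it is safe to run without credentials.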
This is still an ongoing issue for anyone who comes across it. In my setup the cause was the availability zone of the subnet used by the RDS connection, but as I understand it, this applies to any of the connection types.
The "fix" was to:
If the job still fails with Resource Unavailable, repeat until it works.