
AWS Glue ETL job from AWS Redshift to S3 fails

I am trying out the AWS Glue service to ETL some data from Redshift to S3. The crawler runs successfully and creates the metadata table in the Data Catalog. However, when I run the ETL job (generated by AWS), it fails after around 20 minutes with "Resource unavailable".

I cannot see any AWS Glue logs or error logs in CloudWatch. When I try to view them, it says "Log stream not found. The log stream jr_xxxxxxxxxx could not be found. Check if it was correctly created and retry."

I would appreciate any guidance on resolving this issue.


Basically, the job you add to Glue will run only if there is not too much traffic in the region your Glue is in. If no resources are available, you need to either re-add the job manually or subscribe to events from CloudWatch via SNS.
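The manual resubmission could be automated along these lines. This is only a sketch, assuming `glue` behaves like `boto3.client("glue")`; the function name `run_with_resubmit` and its defaults are hypothetical, and the state names are the ones Glue reports in `JobRunState`.

```python
import time

# States after which a Glue job run will not change any more.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "TIMEOUT", "STOPPED", "ERROR", "EXPIRED"}


def run_with_resubmit(glue, job_name, max_attempts=3, poll_seconds=30, sleep=time.sleep):
    """Start a Glue job and resubmit it until it succeeds or attempts run out.

    `glue` is assumed to behave like boto3.client("glue").
    Returns a tuple (final_state, attempts_used).
    """
    state = None
    for attempt in range(1, max_attempts + 1):
        run_id = glue.start_job_run(JobName=job_name)["JobRunId"]
        # Poll the run until it reaches a terminal state.
        while True:
            run = glue.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
            state = run["JobRunState"]
            if state in TERMINAL_STATES:
                break
            sleep(poll_seconds)
        if state == "SUCCEEDED":
            return state, attempt
        print(f"attempt {attempt} ended in {state}: {run.get('ErrorMessage', '')}")
    return state, max_attempts
```

With real credentials this would be called as `run_with_resubmit(boto3.client("glue"), "my-job")`, where `my-job` is a placeholder job name. The client is passed in rather than created inside the function so the loop can be exercised against a stub first.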

Also, there are parameters you can pass to the job, such as MaxRetries and Timeout.

If you get a Resource unavailable error, it won't trigger a retry, because the job did not fail; it never even started. But if you set the timeout to, say, 60 minutes, it will trigger an error after that time, decrement your retry pool, and relaunch the job.
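As an illustration, the retry count and timeout can be set when the job is defined through the Glue API, where the fields are named `MaxRetries` and `Timeout` (in minutes). The sketch below assumes boto3; the helper names, the role ARN, and the script path are hypothetical placeholders, and the default values are just the ones discussed above.

```python
def build_job_definition(name, role_arn, script_location, max_retries=2, timeout_minutes=60):
    """Return keyword arguments for Glue's CreateJob call."""
    return {
        "Name": name,
        "Role": role_arn,
        "Command": {"Name": "glueetl", "ScriptLocation": script_location},
        "MaxRetries": max_retries,   # how many times Glue relaunches a failed run
        "Timeout": timeout_minutes,  # minutes before a run is marked TIMEOUT
    }


def create_job(job_kwargs):
    """Send the definition to Glue (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without AWS access

    return boto3.client("glue").create_job(**job_kwargs)


# Placeholder values, for illustration only.
job_kwargs = build_job_definition(
    name="redshift-to-s3",
    role_arn="arn:aws:iam::123456789012:role/GlueServiceRole",
    script_location="s3://my-bucket/scripts/redshift_to_s3.py",
)
```

Calling `create_job(job_kwargs)` would then register the job. For a job that already exists, the same two settings can be changed in the console's job details instead.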

The closest thing I see to Glue documentation on this is here:

If you encounter errors in AWS Glue, use the following solutions to help you find the source of the problems and fix them.

Note: The AWS Glue GitHub repository contains additional troubleshooting guidance in AWS Glue Frequently Asked Questions.

Error: Resource Unavailable

If AWS Glue returns a resource unavailable message, you can view error messages or logs to help you learn more about the issue. The following tasks describe general methods for troubleshooting.

• A custom DNS configuration without reverse lookup can cause AWS Glue to fail. Check your DNS configuration. If you are using Amazon Route 53 or Microsoft Active Directory, make sure that there are forward and reverse lookups. For more information, see Setting Up DNS in Your VPC (p. 23).
• For any connections and development endpoints that you use, check that your cluster has not run out of elastic network interfaces.
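The forward and reverse lookup check from the first bullet can be sketched with Python's standard `socket` module. This is an assumption about how one might test it, not something the Glue documentation prescribes, and the result is only meaningful when run from a machine inside the same VPC and using the same DNS servers. The function name and the returned dictionary shape are hypothetical.

```python
import socket


def check_forward_and_reverse(hostname,
                              forward=socket.gethostbyname,
                              reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, then resolve that IP back to a name."""
    result = {"hostname": hostname, "ip": None, "reverse_name": None, "ok": False}
    try:
        result["ip"] = forward(hostname)
        # gethostbyaddr returns (primary_name, alias_list, ip_list).
        result["reverse_name"] = reverse(result["ip"])[0]
        result["ok"] = True
    except OSError as exc:  # socket.gaierror and socket.herror are OSError subclasses
        result["error"] = str(exc)
    return result
```

A call such as `check_forward_and_reverse("my-cluster.example.internal")` (placeholder hostname) returns `ok` as `False` and an `error` entry when either lookup fails. The resolver functions are parameters so the check can be tried with stand-ins before pointing it at real DNS.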

I recently struggled with a Resource Unavailable error thrown by a Glue job.

I was also unable to create a direct connection to RDS in Glue; it said "no suitable security group found".

I faced this issue while trying to connect to AWS RDS and Redshift.

The problem was with the security group that Redshift was using. The security group needs a self-referencing inbound rule.

For those who don't know what a self-referencing inbound rule is, follow these steps:

1) Go to the security group you are using (VPC -> Security Group)

2) In the Inbound Rules, select Edit Inbound Rules

3) Add a rule

a) Type - All Traffic
b) Protocol - All
c) Port Range - All
d) Source - Custom; in the field provided, type the first characters of your security group and select it.
e) Save it.

It's done!

If this rule was missing from your security group's inbound rules, try creating the connection again; you should now be able to create it.

The job should also work this time.
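The console steps above could also be scripted. The following is a minimal sketch assuming boto3 and its EC2 `authorize_security_group_ingress` call; the helper names and the group ID are hypothetical placeholders. The rule allows all traffic whose source is the same security group, which is what "self-referencing" means here.

```python
def self_referencing_rule(group_id):
    """Build the ingress rule: all traffic, with the group itself as the source."""
    return {
        "GroupId": group_id,
        "IpPermissions": [
            {
                "IpProtocol": "-1",  # -1 means all protocols and all ports
                "UserIdGroupPairs": [{"GroupId": group_id}],
            }
        ],
    }


def add_self_referencing_rule(group_id):
    """Apply the rule (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without AWS access

    ec2 = boto3.client("ec2")
    return ec2.authorize_security_group_ingress(**self_referencing_rule(group_id))


# Placeholder group ID, for illustration only.
rule = self_referencing_rule("sg-0123456789abcdef0")
```

Calling `add_self_referencing_rule("sg-0123456789abcdef0")` with the real group ID would have the same effect as steps 1 to 3.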
