Not able to populate AWS Glue ETL Job metrics
I am trying to populate the maximum possible set of Glue job metrics for some testing; below is the setup I have created:
The job runs without any issue and I can see the final data being dumped into the Redshift table. However, in the end, only the 5 CloudWatch metrics below are being populated:
There are approximately 20 more metrics that are not getting populated.
Any suggestions on how to populate those remaining metrics as well?
I met the same issue. Do your glue.s3.filesystem.read_bytes and glue.s3.filesystem.write_bytes metrics have any data?
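One way to check is to query CloudWatch directly: Glue publishes job metrics to the "Glue" namespace with JobName, JobRunId, and Type dimensions (JobRunId can be "ALL" to aggregate across runs). A minimal sketch, assuming a hypothetical job name "my-glue-job" and that the byte-count metrics are reported as gauges:

```python
import datetime

def glue_metric_query(metric_name, job_name, job_run_id="ALL"):
    """Build the GetMetricStatistics parameters for a Glue job metric.

    Glue job metrics live in the "Glue" CloudWatch namespace; the JobRunId
    dimension value "ALL" aggregates datapoints across all runs of the job.
    """
    now = datetime.datetime.now(datetime.timezone.utc)
    return {
        "Namespace": "Glue",
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": "JobName", "Value": job_name},
            {"Name": "JobRunId", "Value": job_run_id},
            {"Name": "Type", "Value": "gauge"},  # assumption: byte metrics are gauges
        ],
        "StartTime": now - datetime.timedelta(hours=24),  # look back one day
        "EndTime": now,
        "Period": 300,
        "Statistics": ["Sum"],
    }

if __name__ == "__main__":
    import boto3  # only needed when actually querying CloudWatch
    cw = boto3.client("cloudwatch")
    for name in ("glue.s3.filesystem.read_bytes", "glue.s3.filesystem.write_bytes"):
        resp = cw.get_metric_statistics(**glue_metric_query(name, "my-glue-job"))
        print(name, "datapoints:", len(resp["Datapoints"]))
```

If both metrics come back with zero datapoints, that narrows the problem down to the job not emitting metrics at all rather than a subset going missing.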
One possible reason is that AWS Glue job metrics are not emitted if the job completes in less than 30 seconds.
While running the job, enable the metrics option under the Monitoring tab.
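The console checkbox corresponds to the `--enable-metrics` special job parameter, which can also be passed programmatically (its value is ignored; an empty string is conventional). A small sketch, assuming a hypothetical job named "my-glue-job" and a placeholder temp-dir argument:

```python
def with_metrics_enabled(default_arguments):
    """Return a copy of a Glue job's arguments with job metrics enabled.

    Setting the "--enable-metrics" key is the API equivalent of ticking
    the job metrics checkbox under the Monitoring tab in the console.
    """
    args = dict(default_arguments)  # don't mutate the caller's dict
    args["--enable-metrics"] = ""
    return args

if __name__ == "__main__":
    import boto3  # requires AWS credentials; job name is hypothetical
    glue = boto3.client("glue")
    glue.start_job_run(
        JobName="my-glue-job",
        Arguments=with_metrics_enabled({"--TempDir": "s3://my-bucket/tmp/"}),
    )
```

Passing it in `start_job_run`'s `Arguments` enables metrics for that run only; setting it in the job's `DefaultArguments` enables it for every run.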
Assuming that you are using Glue version 2.0 for the above job, please be advised that AWS Glue version 2.0 does not use dynamic allocation, hence the ExecutorAllocationManager metrics are not available. Try falling back to Glue 1.0 and confirm that all the documented metrics are now available.
https://docs.aws.amazon.com/glue/latest/dg/reduced-start-times-spark-etl-jobs.html#reduced-start-times-limitations
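Switching an existing job back to Glue 1.0 can be done with `update_job`, but note that the API replaces the whole job definition, so the safest pattern is to copy the definition returned by `get_job`, strip the read-only fields, and change only `GlueVersion`. A hedged sketch (the exact set of fields to strip may vary; "my-glue-job" is a placeholder):

```python
# Fields returned by get_job that are read-only or deprecated and must not
# be echoed back in a JobUpdate payload (an assumption based on common usage).
READ_ONLY_FIELDS = ("Name", "CreatedOn", "LastModifiedOn", "AllocatedCapacity")

def job_update_for_glue_version(job, glue_version="1.0"):
    """Build a JobUpdate payload that pins an existing job to a Glue version."""
    update = {k: v for k, v in job.items() if k not in READ_ONLY_FIELDS}
    update["GlueVersion"] = glue_version
    return update

if __name__ == "__main__":
    import boto3  # requires AWS credentials; job name is hypothetical
    glue = boto3.client("glue")
    job = glue.get_job(JobName="my-glue-job")["Job"]
    glue.update_job(JobName="my-glue-job",
                    JobUpdate=job_update_for_glue_version(job))
```

After re-running the job on Glue 1.0 with metrics enabled, the ExecutorAllocationManager metrics should appear alongside the ones you already see.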