MSCK Repair Command on AWS Glue Catalog job
Can we schedule an AWS Glue job to run MSCK REPAIR commands so that the metadata for newly added partitions is added to the Glue Catalog?
Can a Glue ETL script run the MSCK REPAIR TABLE command without calling Athena?
This is achieved by Glue Crawlers. If you create a crawler, it will update the table based on new fields and add new partitions.
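As a rough illustration of the crawler approach, here is a minimal Python sketch that starts an existing crawler only when it is idle. It assumes boto3 is available and that a crawler has already been created for the table; the helper name start_crawler_if_idle and the crawler name are hypothetical, not from the answer above. The get_crawler and start_crawler calls are part of the boto3 Glue client.

```python
def start_crawler_if_idle(glue, crawler_name):
    """Start the named crawler unless a run is already in progress.

    `glue` is expected to be a boto3 Glue client, e.g. boto3.client("glue").
    Returns True if a new run was started, False otherwise.
    """
    # A crawler reports READY when idle, RUNNING or STOPPING otherwise.
    state = glue.get_crawler(Name=crawler_name)["Crawler"]["State"]
    if state != "READY":
        return False
    glue.start_crawler(Name=crawler_name)
    return True
```

It could be called as start_crawler_if_idle(boto3.client("glue"), "my-crawler") from a scheduled job. A crawler can also be given its own schedule, in which case no extra code is needed.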
You can call the batch_create_partition() API to do it. It doesn't require expensive operations like MSCK REPAIR TABLE or re-crawling. Below is my detailed answer with a code sample:
https://stackoverflow.com/a/52239022/2414855
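For readers who cannot follow the link, the following is a minimal sketch of the batch_create_partition() idea, not the code from the linked answer. It assumes boto3, an existing catalog table, and that each new partition reuses the table's storage descriptor with only the location changed. The helper names, the chunk_size parameter, and the example database, table, and S3 paths are all hypothetical.

```python
import copy


def build_partition_input(table_sd, partition_values, partition_location):
    """Build one PartitionInput by cloning the table's StorageDescriptor."""
    sd = copy.deepcopy(table_sd)  # leave the table's descriptor untouched
    sd["Location"] = partition_location
    return {"Values": list(partition_values), "StorageDescriptor": sd}


def add_partitions(glue, database, table, partitions, chunk_size=100):
    """Register partitions in the Glue Catalog without crawling.

    `glue` is expected to be a boto3 Glue client.
    `partitions` is an iterable of (values, s3_location) pairs.
    Returns the list of per-partition errors reported by the API.
    """
    table_sd = glue.get_table(DatabaseName=database, Name=table)["Table"][
        "StorageDescriptor"
    ]
    inputs = [build_partition_input(table_sd, v, loc) for v, loc in partitions]

    errors = []
    # Send the partitions in chunks; 100 per request is assumed as the limit.
    for start in range(0, len(inputs), chunk_size):
        response = glue.batch_create_partition(
            DatabaseName=database,
            TableName=table,
            PartitionInputList=inputs[start:start + chunk_size],
        )
        errors.extend(response.get("Errors", []))
    return errors
```

A call might look like add_partitions(boto3.client("glue"), "my_db", "my_table", [(["2018", "09"], "s3://my-bucket/my_table/year=2018/month=09/")]). Partitions that already exist would be expected to come back in the returned error list rather than raise an exception, so the caller should inspect it.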