How to connect to AWS Glue catalog tables from a custom Python shell script?

Question

I have some tables in the AWS Glue Data Catalog that were created by crawling data in S3 buckets. I am writing my own Python shell script to perform some data transformations on the data in those tables. How can I connect to those Data Catalog tables from the Python script?

Answer 1

If you want to access Glue catalog tables inside a Python shell job, you can use the aws-data-wrangler library. See the library's installation documentation for how to import it into your Python shell job.

The aws-data-wrangler documentation also has more examples of how to read tables from the Glue catalog. Below is a simple example that you can use to achieve this:

import awswrangler as wr

dtype = wr.catalog.get_table_types(database="awswrangler_test", table="csv_crawler")  # dict of column name -> type

df = wr.athena.read_sql_table(database="awswrangler_test", table="csv_crawler")  # pandas DataFrame, read via Athena
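
If adding aws-data-wrangler to the job is not an option, the catalog can also be queried directly with boto3. The following is only a minimal sketch: the helper names `describe_catalog_table` and `main` are hypothetical, the database and table names are reused from the example above, and it assumes the job's IAM role is allowed to call `glue:GetTable`. It returns the table's S3 location and column types, which you could then use to read the underlying files yourself.

```python
from typing import Any, Dict


def describe_catalog_table(glue_client: Any, database: str, table: str) -> Dict[str, Any]:
    """Return the S3 location and column types of a Glue Data Catalog table.

    `glue_client` is expected to behave like boto3.client("glue").
    """
    response = glue_client.get_table(DatabaseName=database, Name=table)
    storage = response["Table"]["StorageDescriptor"]
    return {
        # Location may be absent for some table types, so use .get()
        "location": storage.get("Location"),
        # Map each column name to its catalog type, e.g. {"id": "bigint"}
        "columns": {col["Name"]: col["Type"] for col in storage["Columns"]},
    }


def main() -> None:
    # boto3 is imported here so the helper above can be exercised without AWS access
    import boto3

    glue = boto3.client("glue")
    info = describe_catalog_table(glue, "awswrangler_test", "csv_crawler")
    print(info["location"])
    print(info["columns"])
```

Call `main()` at the end of the Python shell script to run it inside the job. Passing the client in as an argument keeps the lookup logic separate from the AWS session setup, so the helper can be checked locally with a stand-in client.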

Answer 1: accepted 2021-05-07 01:12:02
