How to connect to AWS Glue Catalog tables from a custom Python shell script?

I have some tables in the AWS Glue Data Catalog that were created by crawling data in S3 buckets. I am writing my own Python shell script to perform some data transformations on the data in those tables. How can I connect to those Data Catalog tables from the Python script?

If you want to access Glue Catalog tables inside a Python shell job, you can use the aws-data-wrangler library (imported as awswrangler). Refer to this on how you can import it into your Python shell job.

This and this also have more examples of how to read tables from the Glue Catalog. Below is a simple example that you can use to achieve this:

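import awswrangler as wr  # the aws-data-wrangler package

# Column types recorded in the catalog, then the table data (read through Athena) as a pandas DataFrame
dtype = wr.catalog.get_table_types(database="awswrangler_test", table="csv_crawler")
df = wr.athena.read_sql_table(database="awswrangler_test", table="csv_crawler")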
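For use inside a script, the read can be wrapped in a small helper that first checks that the table exists in the catalog. This is only a sketch: read_catalog_table is a hypothetical name, not part of the library, and the optional wr argument is an assumption added so the helper can be exercised with a stand-in object when no AWS access is available. It also assumes the job's role is allowed to query Athena and read the underlying S3 data.

```python
from typing import Any, Optional


def read_catalog_table(database: str, table: str, wr: Optional[Any] = None):
    """Read a Glue Catalog table into a pandas DataFrame through Athena.

    Hypothetical helper for illustration. `wr` defaults to the real
    awswrangler module; passing another object is only meant for testing.
    """
    if wr is None:
        # Imported lazily so the module is only required when actually used.
        import awswrangler as wr

    # Fail early with a clear message if the crawler has not created the table.
    if not wr.catalog.does_table_exist(database=database, table=table):
        raise ValueError(f"Table {database}.{table} not found in the Glue Catalog")

    # Athena runs the query; the result comes back as a pandas DataFrame.
    return wr.athena.read_sql_table(table=table, database=database)
```

In the Python shell job this would be called as df = read_catalog_table("awswrangler_test", "csv_crawler"), after which the transformations can be written with ordinary pandas operations on df.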
