
Access AWS Glue from local Spark

Is there any way to run Spark SQL queries with a local master against AWS Glue?

I launch this code on my local PC:

SparkSession.builder()
    .master("local")
    .enableHiveSupport()
    .config("hive.metastore.client.factory.class", "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory")
    .getOrCreate()
    .sql("show databases"); // this query isn't running against AWS Glue

EDIT: Based on some examples, it appears that the hive.metastore.uris configuration key should allow specifying a particular metastore URL. However, it's not clear how to get the relevant value for Glue.

SparkSession.builder()
    .master("local")
    .enableHiveSupport()
    .config("hive.metastore.client.factory.class", "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory")
    .config("hive.metastore.uris", "thrift://???:9083")
    .getOrCreate()
    .sql("show databases"); // this query isn't running against AWS Glue

Amazon provides this client, which should solve the problem (I haven't tried it yet):

https://github.com/awslabs/aws-glue-data-catalog-client-for-apache-hive-metastore
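
For reference, here is a minimal sketch of the settings involved, written as plain Java so it runs without Spark on the classpath. The class name GlueCatalogConf and its two helper methods are hypothetical, made up for this illustration. The factory class name is the one from the question, and the spark.hadoop. prefix is Spark's general mechanism for passing a property through to the Hadoop/Hive configuration. Note that no hive.metastore.uris entry is set: the Glue Data Catalog is reached through the AWS API by the client factory, not through a Thrift endpoint. Whether a given local Spark build actually honors the factory setting depends on the client jar being on the classpath and on the Hive client that build ships with, which this sketch does not check.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: collects the settings discussed above and renders
// them as spark-submit flags. It does not start Spark or contact AWS.
class GlueCatalogConf {
    // Factory class name as given in the question; it is provided by the
    // awslabs client jar, which must be on the driver classpath.
    static final String GLUE_FACTORY =
        "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory";

    static Map<String, String> glueSettings() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.master", "local");
        // Configuration-key counterpart of calling enableHiveSupport() on the builder.
        conf.put("spark.sql.catalogImplementation", "hive");
        // The spark.hadoop. prefix forwards the property to the Hadoop/Hive configuration.
        conf.put("spark.hadoop.hive.metastore.client.factory.class", GLUE_FACTORY);
        // Deliberately no hive.metastore.uris: Glue exposes no Thrift metastore URL.
        return conf;
    }

    // Renders each entry as "--conf key=value", separated by single spaces.
    static String toSubmitFlags(Map<String, String> conf) {
        StringJoiner flags = new StringJoiner(" ");
        for (Map.Entry<String, String> e : conf.entrySet()) {
            flags.add("--conf").add(e.getKey() + "=" + e.getValue());
        }
        return flags.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSubmitFlags(glueSettings()));
    }
}
```

The same keys could equally be passed one by one to .config(...) on the SparkSession builder. AWS credentials and region would still have to be available to the AWS SDK in the usual way.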
