Creating an AWS Athena view using a CloudFormation template

Is it possible to create an Athena view via a CloudFormation template? I can create the view using the Athena Dashboard, but I want to do this programmatically using CF templates. I could not find any details in the AWS docs, so I am not sure whether it is supported.

Thanks.

In general, CloudFormation is used for deploying infrastructure in a repeatable manner. This doesn't apply much to data inside a database, which typically persists separately from other infrastructure.

For Amazon Athena, at the time of writing, AWS CloudFormation only supports:

  • Data Catalog
  • Named Query
  • Workgroup

The closest to your requirements is Named Query, which (I think) could store a query that can create the view (e.g. CREATE VIEW ...).

See: AWS::Athena::NamedQuery - AWS CloudFormation
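
As a rough illustration of the Named Query idea, the sketch below builds an AWS::Athena::NamedQuery resource as a plain Python dict and prints it as a JSON template fragment. The logical ID, database, table, and view names are hypothetical placeholders, and the helper function is not part of any AWS library. Keep in mind that a named query only stores the SQL; something still has to run it before the view exists.

```python
import json


def named_query_resource(database, view_name, select_sql, workgroup="primary"):
    """Return an AWS::Athena::NamedQuery resource that stores a CREATE VIEW statement."""
    return {
        "Type": "AWS::Athena::NamedQuery",
        "Properties": {
            "Database": database,
            "Name": f"create_{view_name}",
            "Description": f"Creates the {view_name} view",
            "QueryString": f"CREATE OR REPLACE VIEW {view_name} AS {select_sql}",
            "WorkGroup": workgroup,
        },
    }


# Hypothetical names, for illustration only.
template = {
    "Resources": {
        "CreateViewQuery": named_query_resource(
            "my_database", "orders_view", "SELECT id, total FROM my_database.orders"
        )
    }
}
print(json.dumps(template, indent=2))
```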

Update: @Theo points out that AWS CloudFormation also has AWS Glue resources, including:

  • AWS::Glue::Table

This can apparently be used to create a view. See comments below.

It is possible to create views with CloudFormation, it's just very, very complicated. Athena views are stored in the Glue Data Catalog, like databases and tables are. In fact, Athena views are tables in the Glue Data Catalog, just with slightly different contents.

See this answer for the full description of how to create a view programmatically, and you'll get an idea of the complexity: Create AWS Athena view programmatically. It is possible to map that to CloudFormation, but I would not recommend it.

If you want to create databases and tables with CloudFormation, the resources are AWS::Glue::Database and AWS::Glue::Table.
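
To give a feel for the complexity mentioned above, here is a minimal sketch of what the table definition for a view might look like when expressed as a Glue TableInput. It reflects my understanding of the format described in the linked answer: the view SQL and column list are serialized to JSON, base64-encoded, and wrapped in a comment stored in ViewOriginalText. The keys inside the encoded JSON and the exact Parameters values are assumptions; compare them against a view created by Athena itself before relying on them. The function name and sample values are hypothetical.

```python
import base64
import json


def athena_view_table_input(view_name, database, sql, columns):
    """Build a Glue TableInput dict describing an Athena (Presto) view.

    columns is a list of (name, presto_type, hive_type) tuples.
    """
    # Assumed layout of the encoded view definition.
    presto_view = {
        "originalSql": sql,
        "catalog": "awsdatacatalog",
        "schema": database,
        "columns": [{"name": n, "type": p} for n, p, _ in columns],
    }
    encoded = base64.b64encode(json.dumps(presto_view).encode("utf-8")).decode("ascii")
    return {
        "Name": view_name,
        "TableType": "VIRTUAL_VIEW",
        "Parameters": {"presto_view": "true", "comment": "Presto View"},
        "ViewOriginalText": f"/* Presto View: {encoded} */",
        "ViewExpandedText": "/* Presto View */",
        "StorageDescriptor": {
            "Columns": [{"Name": n, "Type": h} for n, _, h in columns],
            "SerdeInfo": {},
        },
    }


# Hypothetical view with one integer column.
table_input = athena_view_table_input(
    "orders_view", "my_database", "SELECT id FROM my_database.orders",
    [("id", "integer", "int")],
)
print(json.dumps(table_input, indent=2))
```

The resulting dict would go under the TableInput property of an AWS::Glue::Table resource. Note that the column types have to be given twice, once in Presto notation and once in Hive notation, which is a large part of why this route is hard to maintain by hand.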

I think for now the best way to create an Athena view from a CloudFormation template is to use a custom resource backed by Lambda. We have to supply handlers for view creation and deletion. For example, using the crhelper library, the Lambda could be defined as:

from __future__ import print_function
from crhelper import CfnResource
import logging
import os
import boto3

logger = logging.getLogger(__name__)
helper = CfnResource(json_logging=False, log_level='DEBUG', boto_level='CRITICAL', sleep_on_delete=120)

try:
    client = boto3.client('athena')
    ATHENA_WORKGROUP = os.environ['athena_workgroup']
    DATABASE = os.environ['database']
    QUERY_CREATE = os.environ['query_create']
    QUERY_DROP = os.environ['query_drop']
except Exception as e:
    helper.init_failure(e)

@helper.create
@helper.update
def create(event, context):
    logger.info("View creation started")

    try:
        # Start the CREATE VIEW query in the workgroup passed via the environment.
        executionResponse = client.start_query_execution(
            QueryString=QUERY_CREATE,
            QueryExecutionContext={'Database': DATABASE},
            WorkGroup=ATHENA_WORKGROUP
        )
        logger.info(executionResponse)

        response = client.get_query_execution(QueryExecutionId=executionResponse['QueryExecutionId'])
        logger.info(response)

        if response['QueryExecution']['Status']['State'] == 'FAILED':
            logger.error("Query failed")
            raise ValueError("Query failed")

        # Values stored in helper.Data are read back in poll_create via event['CrHelperData'].
        helper.Data['success'] = True
        helper.Data['id'] = executionResponse['QueryExecutionId']
        helper.Data['message'] = 'query is running'

    except Exception as e:
        # Log with traceback; the check below turns this into a resource failure.
        logger.exception("An exception occurred: %s", e)

    if not helper.Data.get("success"):
        raise ValueError("Creating custom resource failed.")


@helper.delete
def delete(event, context):
    logger.info("View deletion started")

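    try:
        # Start the DROP VIEW query in the workgroup passed via the environment.
        executionResponse = client.start_query_execution(
            QueryString=QUERY_DROP,
            QueryExecutionContext={'Database': DATABASE},
            WorkGroup=ATHENA_WORKGROUP
        )
        logger.info(executionResponse)

    except Exception as e:
        # Errors are logged but not re-raised, so a failed DROP does not block stack deletion.
        logger.exception("An exception occurred: %s", e)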

@helper.poll_create
def poll_create(event, context):
    logger.info("Poll creation")

    response = client.get_query_execution(QueryExecutionId=event['CrHelperData']['id'])

    logger.info(f"Poll response: {response}")

    state = response['QueryExecution']['Status']['State']

    # Athena reports one of QUEUED, RUNNING, SUCCEEDED, FAILED or CANCELLED:
    # FAILED or CANCELLED - we stop and fail creation
    # SUCCEEDED - we stop and succeed creation
    # QUEUED or RUNNING - we continue polling in 2 minutes
    if state in ('FAILED', 'CANCELLED'):
        logger.error(f"Query {state}")
        raise ValueError(f"Query {state}")

    if state == 'SUCCEEDED':
        logger.info("Query SUCCEEDED")
        # Returning True (or a resource id) indicates that creation is complete.
        # If True is returned, an id will be generated.
        return True

    # Returning False indicates that creation is not complete and we need to poll again.
    logger.info(f"Query {state}")
    return False

def handler(event, context):
    helper(event, context)

The Athena queries for view creation, update, and deletion are passed to the Lambda as environment variables. In the CloudFormation template we have to define the Lambda that runs the Python code above and creates, updates, or deletes the Athena view. For example:

  AthenaCommonViewLambda:
    Type: 'AWS::Lambda::Function'
    DependsOn: [CreateAthenaViewLayer, CreateAthenaViewLambdaRole]
    Properties:
      Environment:
        Variables:
          athena_workgroup: !Ref AudienceAthenaWorkgroup
          database:
            Ref: DatabaseName
          query_create: !Sub >-
            CREATE OR REPLACE VIEW ${TableName}_view AS
            SELECT field1, field2, ...
            FROM ${DatabaseName}.${TableName}
          query_drop: !Sub DROP VIEW IF EXISTS ${TableName}_view
      Code:
        S3Bucket: !Ref SourceS3Bucket
        S3Key: createview.zip
      FunctionName: !Sub '${AWS::StackName}_create_common_view'
      Handler: createview.handler
      MemorySize: 128
      Role: !GetAtt CreateAthenaViewLambdaRole.Arn
      Runtime: python3.8
      Timeout: 60
      Layers:
        - !Ref CreateAthenaViewLayer

  AthenaCommonView:
    Type: 'Custom::AthenaCommonView'
    Properties:
      ServiceToken: !GetAtt AthenaCommonViewLambda.Arn
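
Because the create and drop statements are written separately in the template, the two view names can drift apart, in which case deleting the stack would leave the view behind. One way to guard against that is to derive both statements from a single view name. The sketch below is a hypothetical helper (not part of crhelper or boto3) with placeholder names; it also rejects anything that is not a plain identifier, since the values end up inside SQL.

```python
import re

# Plain identifiers only: letters, digits and underscores, not starting with a digit.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def view_statements(database, table, select_columns, suffix="_view"):
    """Return (create_sql, drop_sql) for a view derived from one table."""
    for name in (database, table, *select_columns):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"Not a plain identifier: {name!r}")
    view = f"{table}{suffix}"
    create_sql = (
        f"CREATE OR REPLACE VIEW {view} AS "
        f"SELECT {', '.join(select_columns)} FROM {database}.{table}"
    )
    drop_sql = f"DROP VIEW IF EXISTS {view}"
    return create_sql, drop_sql


# Hypothetical names, for illustration only.
create_sql, drop_sql = view_statements("my_database", "orders", ["id", "total"])
print(create_sql)
print(drop_sql)
```

The two strings could then be supplied to the Lambda in place of the hand-written query_create and query_drop values.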
