
Creating a CloudWatch metric from Athena query results

My requirement

I want to create a CloudWatch metric from the results of an Athena query.

Example

  1. I want to create a metric such as user_count per day. In Athena, I would write a SQL query like this:
select date,count(distinct user) as count from users_table group by 1

I can see the results in the Athena editor, but I want to view them as a metric in CloudWatch.

CloudWatch-Metric-Name ==> user_count
Dimensions ==> Date,count

With this CloudWatch metric and its dimensions, I could easily build a monitoring dashboard and send alerts.

Can anyone suggest a way to do this?

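It is a bit involved, but you can use a Lambda for this. In short:

  1. Set up your query in Athena and make sure it works in the Athena console.
  2. Create a Lambda that:
    • runs your Athena query
    • pulls the query results from S3
    • parses the query results
    • sends the query results to CloudWatch as a metric
  3. Use EventBridge to run your Lambda on a schedule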
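Step 3 above can be sketched with the EventBridge API. This is a minimal sketch, not part of the original answer: the function name `schedule_lambda`, the rule name `athena-metric-daily`, and the target id are hypothetical, and the client is passed in so the function can be exercised without AWS access. It assumes `events_client` is a `boto3.client('events')` and that the Lambda already exists.

```python
def schedule_lambda(events_client, lambda_arn,
                    rule_name="athena-metric-daily",  # hypothetical rule name
                    schedule="rate(1 day)"):
    """Create (or update) a scheduled EventBridge rule that targets a Lambda.

    events_client is assumed to be boto3.client('events').
    Returns the ARN of the rule.
    """
    # put_rule is idempotent: calling it again with the same name updates the rule.
    rule = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=schedule,
        State="ENABLED",
    )
    # Attach the Lambda as the rule's target.
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "athena-metric-lambda", "Arn": lambda_arn}],
    )
    return rule["RuleArn"]


# Example call (requires AWS credentials, so it is left commented out):
# import boto3
# schedule_lambda(boto3.client("events"),
#                 "arn:aws:lambda:REGION:ACCOUNT_ID:function:FUNCTION_NAME")
```

The Lambda also needs a resource-based permission that lets `events.amazonaws.com` invoke it; creating the schedule in the console adds that permission for you, while the sketch above does not.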

Below is an example Lambda function in Python that performs step 2. Note that the Lambda function needs IAM permissions to run queries in Athena, read the results from S3, and put metrics into CloudWatch.

import time
import boto3


query = 'select count(*) from mytable'
DATABASE = 'default'
bucket='BUCKET_NAME'
path='yourpath'


def lambda_handler(event, context):
    
    #Run query in Athena
    client = boto3.client('athena')
    output =  "s3://{}/{}".format(bucket,path)
    # Execution
    response = client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={
            'Database': DATABASE
        },
        ResultConfiguration={
            'OutputLocation': output,
        }
    )

    #S3 file name uses the QueryExecutionId so 
    #grab it here so we can pull the S3 file.
    qeid = response["QueryExecutionId"]
    
    
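    #Athena runs the query asynchronously, so the result file may not
    #exist yet. Poll the query state instead of sleeping a fixed time.
    #Note:  You are charged for time the lambda is running.
    while True:
        execution = client.get_query_execution(QueryExecutionId=qeid)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)
    if state != "SUCCEEDED":
        raise RuntimeError("Athena query ended in state " + state)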
    
    #Get query result from S3.
    s3 = boto3.client('s3')
    objectkey = path + "/" + qeid + ".csv"
    #load object as file
    file_content = s3.get_object(
        Bucket=bucket,
        Key=objectkey)["Body"].read()
    #split file into lines
    lines = file_content.decode().splitlines()
    #the first line is the CSV header, so the count is on the second line
    count = lines[1]
    #remove double quotes
    count = count.replace("\"", "")
    #convert string to int since cloudwatch wants numeric for value
    count = int(count)
    
    
    #post query results as a CloudWatch metric
    cloudwatch = boto3.client('cloudwatch')
    response = cloudwatch.put_metric_data(
        MetricData = [
            {
                'MetricName': 'MyMetric',
                'Dimensions': [
                    {
                        'Name': 'DIM1',
                        'Value': 'dim1'
                    },
                ],
                'Unit': 'None',
                'Value': count
            },
        ],
        Namespace = 'MyMetricNS'
    )
    
    return response
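The example above posts a single count. For the per-day query from the question (`select date, count(distinct user) ...`), one option is to send each row as a data point whose `Timestamp` is the date and whose `Value` is the count, rather than using the date and count as dimensions. The following is a minimal sketch under assumptions: the helper name `rows_to_metric_data` is hypothetical, the result file is assumed to have a header row with columns named `date` and `count`, and dates are assumed to be formatted as `YYYY-MM-DD`.

```python
import csv
import io
from datetime import datetime, timezone


def rows_to_metric_data(csv_text, metric_name="user_count"):
    """Convert Athena CSV output with columns (date, count) into a list
    of entries suitable for the MetricData argument of put_metric_data.

    Assumes a header row and dates formatted as YYYY-MM-DD.
    """
    # csv.DictReader strips the double quotes Athena puts around values.
    reader = csv.DictReader(io.StringIO(csv_text))
    metric_data = []
    for row in reader:
        # Treat each date as midnight UTC (an assumption of this sketch).
        day = datetime.strptime(row["date"], "%Y-%m-%d").replace(
            tzinfo=timezone.utc)
        metric_data.append({
            "MetricName": metric_name,
            "Timestamp": day,
            "Value": float(row["count"]),
            "Unit": "Count",
        })
    return metric_data
```

The returned list could replace the hard-coded `MetricData` in the Lambda, with `csv_text` being `file_content.decode()`. Keep in mind that CloudWatch only accepts data points with fairly recent timestamps (the API documentation gives a limit of about two weeks in the past), so this suits a daily run that posts recent days rather than a full history backfill.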
