![](/img/trans.png)
[英]aws lambda function to disable and delete CloudFront distribution in python boto3
[英]Boto3 CloudFront Object Usage Count
希望計算我的 CloudFront dist 中所有對象被單獨命中的次數,以便我可以生成一個 excel 表來跟蹤使用情況統計信息。 我一直在查看 CloudFront 的 boto3 文檔,但無法確定可以訪問該信息的位置。 我看到 AWS Cloudfront 控制台生成了“流行對象”報告。 我不確定是否有人知道如何獲取 AWS 在 boto3 中為該報告生成的數字?
如果它不能通過 Boto3 訪問,我應該使用 AWS CLI 命令嗎?
更新:
這是我最終用作偽代碼的內容,希望它是其他人的起點:
import boto3
import gzip
from datetime import datetime, date, timedelta
import shutil
from xlwt import Workbook
def analyze(timeInterval):
    """Aggregate CloudFront standard-log usage per URL into an .xls report.

    Scans every gzipped access-log object in the configured S3 bucket,
    keeps only the logs whose filename date falls inside the requested
    window, tallies per-URL hit counts and byte totals, and saves the
    result as ``<timeInterval>_Log_Result_<timestamp>.xls``.

    Relies on module-level constants: AWS_ACCESS_KEY_ID, PASSWORD,
    AWS_STORAGE_BUCKET_NAME, CLOUDFRONT_IDENTIFIER.

    :param timeInterval: one of 'daily', 'weekly', 'monthly', 'yearly'
    :raises ValueError: if timeInterval is not a recognized window
    """
    # Window size for each supported reporting interval.  Previously an
    # unknown interval silently disabled filtering and processed every
    # log file; now it fails fast.
    windows = {
        'daily': timedelta(days=1),
        'weekly': timedelta(days=7),
        'monthly': timedelta(weeks=4),
        'yearly': timedelta(weeks=52),
    }
    try:
        cutoff = date.today() - windows[timeInterval]
    except KeyError:
        raise ValueError(
            'timeInterval must be one of: %s' % ', '.join(sorted(windows)))

    s3 = boto3.resource('s3', aws_access_key_id=AWS_ACCESS_KEY_ID,
                        aws_secret_access_key=PASSWORD)
    bucket = s3.Bucket(AWS_STORAGE_BUCKET_NAME)

    # Workbook that receives the aggregated per-URL statistics.
    wb = Workbook()
    sheet1 = wb.add_sheet('Log Results By URL')
    sheet1.write(0, 1, 'File')
    sheet1.write(0, 2, 'Total Hit Count')
    sheet1.write(0, 3, 'Total Byte Count')

    logLines = []
    for item in bucket.objects.all():
        # CloudFront standard-log keys look like
        # <distribution-id>.YYYY-MM-DD-HH.<unique-id>.gz; strip the
        # distribution prefix and the '-HH' hour suffix to recover the
        # log's calendar date.
        stamp = item.key.replace(CLOUDFRONT_IDENTIFIER + '.', '')
        stamp = stamp.split('.')[0][:-3]
        year, month, day = (int(part) for part in stamp.split('-'))
        logDate = date(year=year, month=month, day=day)
        if logDate <= cutoff:
            # File is older than the requested window; skip it.
            continue

        print('Analyzing File:', item.key)
        bucket.download_file(item.key, 'logFile.gz')
        # Read the gzipped log directly as text (no intermediate .txt
        # file).  The first two lines are the '#Version' and '#Fields'
        # headers, not data rows — drop them, as the original did.
        with gzip.open('logFile.gz', 'rt') as f:
            logLines.extend(f.readlines()[2:])

    # Tally hits and bytes per URL.  Fields are tab-separated; per the
    # CloudFront standard-log layout, index 7 is cs-uri-stem (the object
    # path) and index 3 is sc-bytes (bytes served).
    outputDict = {}
    for dataline in logLines:
        fields = dataline.split('\t')
        url, size = fields[7], int(fields[3])
        entry = outputDict.setdefault(url, {'count': 0, 'byteCount': 0})
        entry['count'] += 1
        entry['byteCount'] += size

    # One row per URL, starting below the header row.
    for row, (url, stats) in enumerate(outputDict.items(), start=1):
        sheet1.write(row, 1, str(url))
        sheet1.write(row, 2, stats['count'])
        sheet1.write(row, 3, stats['byteCount'])

    # ':' is not allowed in filenames on some platforms.
    safeDateTime = str(datetime.now()).replace(':', '.')
    wb.save(str(timeInterval) + str('_Log_Result_' + str(safeDateTime)) + '.xls')
if __name__ == '__main__':
    # Produce the daily usage report when invoked as a script.
    analyze('daily')
從配置和使用標准日志(訪問日志) - Amazon CloudFront :
您可以將 CloudFront 配置為創建日志文件,其中包含有關 CloudFront 收到的每個用戶請求的詳細信息。 這些稱為標准日志,也稱為訪問日志。 這些標准日志可用於 web 和 RTMP 發行版。 如果您啟用標准日志,您還可以指定您希望 CloudFront 在其中保存文件的 Amazon S3 存儲桶。
日志文件可能非常大,但您可以使用 Amazon Athena 查詢 Amazon CloudFront 日志。
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.