Issue with nextSequenceToken for the PutLogEvents API (AWS Cloudwatch logs)

Question

I am getting the error below when calling put_log_events on the AWS CloudWatch Logs API. Some context: these are organizational CloudTrail logs, and I am trying to create a log group for each account (using a Kinesis stream for the subscription).

import base64
import gzip
import json
import logging
import os

import boto3

# Setup logging configuration
logging.basicConfig()
logger = logging.getLogger()
logger.setLevel(logging.INFO)

logs_client = boto3.client('logs', region_name=os.getenv('AWS_REGION'))

# setting environment variables
global seq_token
seq_token = None


def unpack_kinesis_stream_records(event):
    # decode and decompress each base64 encoded data element
    return [gzip.decompress(base64.b64decode(k["kinesis"]["data"])).decode('utf-8') for k in event["Records"]]


def decode_raw_cloud_trail_events(cloudTrailEventDataList):
    # Convert Raw Event Data List
    eventList = [json.loads(e) for e in cloudTrailEventDataList]

    # Filter out non-DATA_MESSAGE records, since we only need CloudWatch messages with messageType == 'DATA_MESSAGE'
    filteredEvents = [
        e for e in eventList if e["messageType"] == 'DATA_MESSAGE']

    # Convert each individual log event message
    events = []
    for f in filteredEvents:
        for e in f["logEvents"]:
            events.append(
                {
                    'timestamp': e["timestamp"],
                    'message': e["message"],
                }
            )

    events.sort(key=lambda x: x["timestamp"])
    logger.info("{0} Event Logs Decoded".format(len(events)))

    log_group = ("log_group_for_cloudtrail_"+eventList[0]["logStream"].split("_")[1])

    log_stream = os.getenv('AWS_LAMBDA_LOG_STREAM_NAME')

    # creating log group
    try:
        logs_client.create_log_group(
            logGroupName=log_group,
            tags={
                'Created By': 'ZH Lamda'
            }
        )
    except logs_client.exceptions.ResourceAlreadyExistsException:
        print("log group exists")

    try:
        logs_client.create_log_stream(
            logGroupName=log_group,
            logStreamName=log_stream,
        )
    except logs_client.exceptions.ResourceAlreadyExistsException:
        print("log stream already exists")

    return [log_group, log_stream, events]


def handle_request(event, context):

    # Unpack Kinesis Stream Records
    kinesis_data = unpack_kinesis_stream_records(event)
    # Decode and filter events
    events = decode_raw_cloud_trail_events(kinesis_data)

    if len(events[2]) == 0:
        return f'No events to process'

    log_event = {
        'logGroupName': events[0],
        'logStreamName': events[1],
        'logEvents': events[2],
    }

    if seq_token is not None:
        log_event['sequenceToken'] = seq_token

    response = logs_client.put_log_events(**log_event)
    seq_token = response['nextSequenceToken']

    # events is [log_group, log_stream, events], so count the decoded events in events[2]
    return f"Successfully processed {len(events[2])} records."


def lambda_handler(event, context):
    return handle_request(event, context)

I can see the error below in the CloudWatch logs:

[ERROR] InvalidSequenceTokenException: An error occurred (InvalidSequenceTokenException) when calling the PutLogEvents operation: The given sequenceToken is invalid. The next expected sequenceToken is: null
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 120, in lambda_handler
    return handle_request(event, context)
  File "/var/task/lambda_function.py", line 107, in handle_request
    response = logs_client.put_log_events(**log_event)
  File "/var/runtime/botocore/client.py", line 391, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/var/runtime/botocore/client.py", line 719, in _make_api_call
    raise error_class(parsed_response, operation_name)

Answer 1

I suspect that the Lambda function is being invoked multiple times. If so, the problem is the global seq_token: the module-level assignment seq_token = None runs only once, when the module is loaded for the first invocation in an execution environment, not on every invocation.

On later invocations, seq_token still holds the value from the previous run and is never reset to None. As a result, the next call to put_log_events() passes the if statement and sends a sequence token left over from an earlier execution.
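The effect can be reproduced without AWS. The following is a minimal sketch, not the code from the question: handler and the fake token strings are hypothetical, and calling the function twice in one process stands in for two invocations served by the same warm execution environment.

```python
# Module-level state: initialised once when the module is loaded,
# then shared by every invocation that reuses this process.
seq_token = None


def handler(event, context=None):
    """Hypothetical handler that only tracks which token it would send."""
    global seq_token
    token_sent = seq_token  # the value this invocation would pass as sequenceToken
    # Stand-in for: seq_token = response['nextSequenceToken']
    seq_token = f"token-after-{event['id']}"
    return token_sent


# Two calls in the same process, simulating two invocations of a warm function.
first = handler({'id': 1})
second = handler({'id': 2})
print(first, second)
```

The first call sees no token, while the second call picks up the token stored by the first. If the second invocation targets a different log stream, that leftover token is not the one the stream expects.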

To fix this, initialize the seq_token variable inside the handle_request() function rather than making it global.
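One way this could look is sketched below, under some assumptions: put_events_in_batches, batch_size and FakeLogsClient are hypothetical names introduced here, and the fake client only exists so the sketch runs without AWS credentials. The token is a local variable, so it starts as None on every call and is only carried from one batch to the next within that call.

```python
def put_events_in_batches(logs_client, log_group, log_stream, events, batch_size=2):
    """Send events with a sequence token scoped to this single call."""
    seq_token = None  # local: starts fresh on every invocation
    for start in range(0, len(events), batch_size):
        kwargs = {
            'logGroupName': log_group,
            'logStreamName': log_stream,
            'logEvents': events[start:start + batch_size],
        }
        if seq_token is not None:
            kwargs['sequenceToken'] = seq_token
        response = logs_client.put_log_events(**kwargs)
        # Carry the token only to the next batch of this same call.
        seq_token = response.get('nextSequenceToken')
    return len(events)


class FakeLogsClient:
    """Hypothetical stand-in for boto3.client('logs'); records calls only."""

    def __init__(self):
        self.calls = []

    def put_log_events(self, **kwargs):
        self.calls.append(kwargs)
        return {'nextSequenceToken': f"token-{len(self.calls)}"}


client = FakeLogsClient()
events = [{'timestamp': i, 'message': f"event {i}"} for i in range(3)]
sent = put_events_in_batches(client, 'demo-group', 'demo-stream', events)
sent_again = put_events_in_batches(client, 'demo-group', 'demo-stream', events[:1])
```

In the Lambda from the question, logs_client would be the real boto3 client and the function would be called from handle_request(). Whether a first call without a token is accepted depends on the state of the target log stream, so this sketch only shows the scoping of the variable, not a complete retry strategy.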
