
Pushing data from AWS lambda to Kinesis Firehose using Python

I am trying to send data from RDS to Kinesis Firehose using a Lambda function. I was able to retrieve the data from RDS inside the Lambda function; now I want to push that data from the Lambda function to Kinesis Firehose.

I was able to retrieve the data from RDS using the code in the snippet below, and the query result is stored in the variable 'rows'. But when I try to push that data to Kinesis Firehose, I get this error:

"errorMessage": "a bytes-like object is required, not 'tuple'",

"errorType": "TypeError"

import base64
import json

import boto3
import pymysql

connection = pymysql.connect(host=endpoint, user=username, passwd=password, db=database_name)

FIREHOSE_STREAM = 'DEMOLAMBDAFIREHOSE'
client = boto3.client('firehose')

def lambda_handler(event, context):
    cursor = connection.cursor()
    cursor.execute('SELECT * from inventory.report_product')
    rows = cursor.fetchall()

    for row in rows:
        data = base64.b64encode(row)  # <-- this line raises the TypeError
        response = client.put_record_batch(
            DeliveryStreamName=FIREHOSE_STREAM,
            Records=[
                {
                    'Data': json.dumps(data)
                },
            ]
        )
    print(response)
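For reference, the error can be reproduced outside Lambda: base64.b64encode() rejects anything that is not bytes-like, and cursor.fetchall() returns each row as a tuple. A minimal sketch, using hypothetical row values:

```python
import base64
import json

# A row from cursor.fetchall() is a tuple, e.g. (hypothetical values):
row = (1, "widget", 9.99)

try:
    base64.b64encode(row)  # b64encode only accepts bytes-like objects
except TypeError as exc:
    print(exc)  # a bytes-like object is required, not 'tuple'

# Serializing the tuple to JSON and encoding it yields bytes:
data = json.dumps(row).encode("utf-8")
print(type(data))  # <class 'bytes'>
```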

Two things to try:

  1. Remove the json.dumps() call on data , and serialize each row to bytes instead of calling base64.b64encode() on it. The Data field of put_record_batch() expects a bytes-like object (boto3 base64-encodes the blob for you), and base64.b64encode() itself raises the reported TypeError when handed a tuple. Something like json.dumps(row).encode('utf-8') works.
  2. Batch rows in groups of up to 500. The put_record_batch() method accepts at most 500 records per call.
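The batching in point 2 can also be sketched as a standalone chunking helper (the name `chunked` is hypothetical, not part of the Firehose API):

```python
def chunked(items, size=500):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# e.g. 1200 rows split into batches of 500, 500 and 200:
batches = list(chunked(list(range(1200))))
print([len(b) for b in batches])  # [500, 500, 200]
```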

Example:

import json

import boto3
import pymysql

connection = pymysql.connect(host=endpoint, user=username, passwd=password, db=database_name)

FIREHOSE_STREAM = 'DEMOLAMBDAFIREHOSE'
client = boto3.client('firehose')

MAX_BATCH_SIZE = 500  # put_record_batch() accepts at most 500 records per call

def lambda_handler(event, context):
    cursor = connection.cursor()
    cursor.execute('SELECT * from inventory.report_product')
    rows = cursor.fetchall()

    records = []
    for row in rows:
        # Serialize the tuple to JSON bytes; boto3 base64-encodes the blob itself.
        # NOTE: use json.dumps(row, default=str) if columns include datetime/Decimal values.
        records.append({'Data': json.dumps(row).encode('utf-8')})

        if len(records) == MAX_BATCH_SIZE:
            # flush a full batch of 500 records
            response = client.put_record_batch(
                DeliveryStreamName=FIREHOSE_STREAM,
                Records=records
            )
            print(response)
            records = []

    if records:
        # send the final (partial) batch
        response = client.put_record_batch(
            DeliveryStreamName=FIREHOSE_STREAM,
            Records=records
        )
        print(response)
WARNING: This code example is not tested.
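One related detail worth noting: Firehose concatenates the Data blobs of consecutive records verbatim when delivering to a destination such as S3, so a trailing newline is commonly appended to keep each JSON row on its own line. A minimal sketch (`make_record` is a hypothetical helper name):

```python
import json

def make_record(row):
    # Firehose concatenates record payloads as-is, so a trailing newline
    # keeps each JSON row on its own line in the delivered object.
    return {"Data": (json.dumps(row) + "\n").encode("utf-8")}

print(make_record((1, "widget")))  # {'Data': b'[1, "widget"]\n'}
```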
