
Issues with streaming data to AWS Kinesis Firehose from Python

I've been stuck on this problem for a week now. I honestly think it is a bug with Kinesis Firehose in us-east-1 at the moment.

At the very least, the console automatically creates the role with the wrong trust relationship. Here is what it creates by default (I changed the account ID to 123456 everywhere):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "firehose.amazonaws.com"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "123456"
        }
      }
    }
  ]
}

When I try to call assume_role from my account I always get:

botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:iam::123456:user/fh is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::123456:role/firehose_delivery_role2

The user fh has the AdministratorAccess policy attached.

Instead, you need to use the following trust relationship, which actually works:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456:root"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

But no matter what I do, I always get the following message when I try to put anything into the Firehose stream:

botocore.errorfactory.ResourceNotFoundException: An error occurred (ResourceNotFoundException) when calling the PutRecord operation: Stream test3 under account 123456 not found.

Trying to access it with my admin account, without assume_role, gives me the same result.

My test3 stream delivers data to my Elasticsearch domain.

Can someone create a new Elasticsearch domain and a Kinesis Firehose stream and test data delivery? Ideally from Python/boto3.

Here is an example of the code. Don't look at the variable names ;)

import boto3
import json
from datetime import datetime
import calendar
import random
import time

my_stream_name = 'python-stream'

kinesis_client = boto3.client('sts', aws_access_key_id='key', aws_secret_access_key='secret', region_name='us-east-1')

assumedRoleObject = kinesis_client.assume_role(
    RoleArn="arn:aws:iam::123456:role/firehose_delivery_role3",
    RoleSessionName="AssumeRoleSession1"
)

credentials = assumedRoleObject['Credentials']

kinesis_session = boto3.Session(
    aws_access_key_id=credentials['AccessKeyId'],
    aws_secret_access_key=credentials['SecretAccessKey'],
    aws_session_token=credentials['SessionToken'])

client = kinesis_session.client('kinesis', region_name='us-east-1')

def put_to_stream(thing_id, property_value, property_timestamp):
    payload = {
        'prop': str(property_value),
        'timestamp': str(property_timestamp),
        'thing_id': thing_id
    }

    print(payload)

    put_response = client.put_record(
        StreamName='test3',
        Data=json.dumps(payload),
        PartitionKey=thing_id)

while True:
    property_value = random.randint(40, 120)
    property_timestamp = calendar.timegm(datetime.utcnow().timetuple())
    thing_id = 'aa-bb'

    put_to_stream(thing_id, property_value, property_timestamp)

    # wait for 5 seconds
    time.sleep(5)
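For reference, the credentials returned by assume_role are nested under a 'Credentials' key in the response, which is where the Session arguments come from. A stdlib-only sketch, with a made-up sample dict standing in for the live STS response (the key layout matches what botocore returns, but all values here are placeholders):

```python
# Sketch: extracting temporary credentials from an STS assume_role response.
# The sample dict below stands in for a live response; every value is fake.
from datetime import datetime, timedelta

assume_role_response = {
    'Credentials': {
        'AccessKeyId': 'ASIAEXAMPLE',
        'SecretAccessKey': 'secret-example',
        'SessionToken': 'token-example',
        'Expiration': datetime.utcnow() + timedelta(hours=1),
    },
    'AssumedRoleUser': {
        'AssumedRoleId': 'AROAEXAMPLE:AssumeRoleSession1',
        'Arn': 'arn:aws:sts::123456:assumed-role/firehose_delivery_role2/AssumeRoleSession1',
    },
}

# The three values a boto3.Session for the assumed role needs:
creds = assume_role_response['Credentials']
session_kwargs = {
    'aws_access_key_id': creds['AccessKeyId'],
    'aws_secret_access_key': creds['SecretAccessKey'],
    'aws_session_token': creds['SessionToken'],
}

print(session_kwargs['aws_access_key_id'])
```

Passing the whole response object as a credential (instead of the nested values) is a common mistake and fails before any stream lookup even happens.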

Stream test3 under account 123456 not found

It's always possible that there's a bug in basic AWS functionality that hasn't been noticed by other users, but it's unlikely.

kinesis_session.client('kinesis', region_name='us-east-1')

This creates a client for Kinesis Data Streams, yet your post is about Kinesis Firehose. They are different services, and Boto uses different clients for them. From the docs:

client = boto3.client('firehose')
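Note that the Firehose client's put_record also has a different signature: it takes a DeliveryStreamName and a Record dict of raw bytes, not a StreamName/PartitionKey pair. A minimal sketch of building such a record (the boto3 call itself is left commented out since it needs live credentials; stream and client names are just the ones from the question):

```python
import json

def build_firehose_record(payload):
    """Serialize a dict into the Record shape Firehose's put_record expects."""
    # Firehose delivers raw bytes; a trailing newline keeps records separable
    # when the delivery stream buffers several of them into one batch.
    return {'Data': (json.dumps(payload) + '\n').encode('utf-8')}

payload = {'prop': '42', 'timestamp': '1500000000', 'thing_id': 'aa-bb'}
record = build_firehose_record(payload)

# With live credentials this would be sent as:
# client = boto3.client('firehose', region_name='us-east-1')
# client.put_record(DeliveryStreamName='test3', Record=record)

print(record['Data'])
```

With the firehose client and this Record shape, the "Stream test3 ... not found" error goes away, because the lookup is done against delivery streams instead of data streams.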
