DynamoDB - UUID and avoiding a full table scan

Original 2020-07-28 08:59:26 8 2 amazon-web-services/ aws-lambda/ amazon-dynamodb

This is my use case:

I have a JSON API with 200k objects. The dataset looks something like this: date, bike model, production time in minutes. I use Lambda to read from the JSON API over HTTP and write to DynamoDB. The Lambda function runs every day and updates DynamoDB with the most recent data.

I then retrieve the data by date, since I want to calculate the average production time for each day and put it in a second table. An Alexa skill is connected to the second table and reads out the average value for each day.

First question: Since the same bike model is produced multiple times per day, a composite primary key of date and bike model won't be unique. Should I create a UUID for each entry instead? Or is there a better solution?

Second question: For the calculation I would need to do a full table scan each time, which is very costly and widely advised against. How can I solve this without a full table scan?

Third question: Is it better to avoid DynamoDB altogether for my use case? If so, which AWS database is more suitable?

2 answers

Yes, a UUID or any other unique identifier (e.g. date + bike model + created time) is fine as the primary key.
Your daily job for the average value looks like a data analytics job rather than a transactional one. I would suggest a service built for analytics, such as Amazon Redshift. You should be able to feed data into such a service using DynamoDB Streams. Alternatively, you can stream the data into S3 and use a service like Athena to get the daily average.
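As a minimal sketch of the UUID suggestion above: the attribute names `id`, `production_date`, `bike_model` and `production_minutes`, and the helper `build_item`, are assumptions for illustration and do not come from the question. The function only builds the item dictionary; writing it to DynamoDB (for example with boto3's `Table.put_item`) is left out.

```python
import uuid
from datetime import date


def build_item(production_date: date, bike_model: str, production_minutes: int) -> dict:
    """Build one DynamoDB item with a random UUID as its unique key.

    Attribute names are hypothetical; adapt them to the real table schema.
    """
    return {
        "id": str(uuid.uuid4()),  # unique even for the same model on the same day
        "production_date": production_date.isoformat(),  # e.g. "2020-07-28"
        "bike_model": bike_model,
        "production_minutes": production_minutes,
    }
```

Two calls with identical arguments produce two items with different `id` values, so repeated production runs of the same model on the same day no longer collide.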

There is a simple database model that you could use for this task:

PartitionKey: a UUID, or any combination of fields that provides uniqueness.
SortKey: production date, as a string, e.g. 2020-07-28

If you then create a global secondary index that uses the production date as its partition key and includes the production time, you can query (not scan) that index for a specific date and perform any calculations you need on the production time. You can also provision the required read/write capacity on the secondary index and the table independently.
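The query-then-average step could be sketched as follows. This assumes `table` is a boto3 DynamoDB `Table` resource, and the index name `production_date-index` and the attribute names `production_date` and `production_minutes` are hypothetical. The loop follows `LastEvaluatedKey` because a single `query` call returns at most 1 MB of data.

```python
from decimal import Decimal


def query_items_for_date(table, production_date, index_name="production_date-index"):
    """Query (not scan) the secondary index for one day, following pagination.

    `table` is expected to behave like a boto3 Table resource; the index and
    attribute names are assumptions for illustration.
    """
    kwargs = {
        "IndexName": index_name,
        "KeyConditionExpression": "#d = :d",
        "ExpressionAttributeNames": {"#d": "production_date"},
        "ExpressionAttributeValues": {":d": production_date},
    }
    items = []
    while True:
        page = table.query(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        kwargs["ExclusiveStartKey"] = last_key  # continue from the previous page


def average_production_minutes(items):
    """Return the mean production time as a Decimal, or None for an empty day."""
    if not items:
        return None
    total = sum(Decimal(str(item["production_minutes"])) for item in items)
    return total / len(items)
```

The result of `average_production_minutes(query_items_for_date(table, "2020-07-28"))` could then be written to the second table that the Alexa skill reads. Only items for the requested date are read, so the cost scales with one day's production rather than the whole table.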

Regarding your third question, I don't see any real benefit in using DynamoDB for this task. Any RDS engine (e.g. MySQL), Redshift, or even S3 + Athena can easily handle this use case. If you require real-time analytics, you could even consider Amazon Kinesis.

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint, please indicate the site URL or the original address. For any questions, contact: yoyou2525@163.com.