
AWS Data Lake Dynamo vs ElasticSearch

Original post: 2017-10-09 18:38:15 · 2 · 1 · Tags: amazon-web-services, elasticsearch, amazon-s3, amazon-dynamodb, data-lake

I am really struggling to understand how DynamoDB and Elasticsearch (ES) should be used to support AWS data lake efforts (metadata / catalogs). It seems as though you would log the individual S3 locations of the zip archives for each source in DynamoDB, and keep any additional metadata / attributes you would like to search by in ES. If that is correct, how would you use the two together? I tried to find more detailed information about how to pair them properly, but have been unsuccessful. Any information or documentation that others have would be great. There is a good chance I am overlooking some obvious examples or documentation.

What I am imagining is something like the following:

A user searches for metadata / attributes in ES, and the results point to the high-level S3 buckets / partitions that match.
The search in DynamoDB is then run against the part of the key (partition / bucket) taken from the ES result.
That search would most likely return many individual objects / keys that could then be processed, extracted, etc.
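
To make the flow above concrete, here is a minimal sketch of the two-step lookup. It uses plain in-memory Python structures as stand-ins for the ES index and the DynamoDB table, so it shows only the shape of the lookup, not real client calls. The bucket names, attribute names, and function names are all hypothetical.

```python
# Hypothetical stand-in for the ES index: one document per S3 prefix,
# holding the attributes a user would search by.
ES_DOCS = [
    {"prefix": "s3://lake-raw/sales/2017/10/", "source": "sales", "format": "zip"},
    {"prefix": "s3://lake-raw/hr/2017/10/", "source": "hr", "format": "zip"},
]

# Hypothetical stand-in for the DynamoDB table: the partition key is the
# S3 prefix, and each entry under it is an object key (the sort key).
DYNAMO_ITEMS = {
    "s3://lake-raw/sales/2017/10/": ["orders-02.zip", "orders-01.zip"],
    "s3://lake-raw/hr/2017/10/": ["headcount.zip"],
}


def search_metadata(**attrs):
    """Step 1 (the ES role): return prefixes whose attributes all match."""
    return [
        doc["prefix"]
        for doc in ES_DOCS
        if all(doc.get(name) == value for name, value in attrs.items())
    ]


def list_objects(prefix):
    """Step 2 (the DynamoDB role): look up object keys by partition key."""
    return sorted(DYNAMO_ITEMS.get(prefix, []))


def find_objects(**attrs):
    """Step 3: expand every matching prefix into full object locations."""
    return [
        prefix + key
        for prefix in search_metadata(**attrs)
        for key in list_objects(prefix)
    ]
```

Calling find_objects(source="sales") returns the full S3 locations of both sales archives. In a real setup, step 1 would be an ES query and step 2 a DynamoDB query on the partition key.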

1 answer

I spoke to one of our AWS reps, who referred me to this article: AWS Data Lake. It was a great starting point, and it answered some of my questions about the use of the components and the overall approach, which were previously unclear to me.

Highlights:

It is a blueprint for implementing a data lake. Combining S3 / DynamoDB / ES is common.
There are many variations on the implementation: substituting an RDS database for ES / DynamoDB, using just ES, etc.
We will most likely start with an RDS database to work out the process, then move to DynamoDB / ES.
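
For the relational variation, a single catalog table is enough to try the process out. The sketch below assumes SQLite (from the Python standard library) as a local stand-in for an RDS database. The table name, columns, and sample rows are hypothetical and not taken from the article.

```python
import sqlite3

# Hypothetical sample rows: (S3 location, source system, ingest date).
ROWS = [
    ("s3://lake-raw/sales/2017/10/orders-01.zip", "sales", "2017-10-01"),
    ("s3://lake-raw/sales/2017/10/orders-02.zip", "sales", "2017-10-02"),
    ("s3://lake-raw/hr/2017/10/headcount.zip", "hr", "2017-10-03"),
]


def build_catalog(rows):
    """Create an in-memory catalog table and load the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE catalog ("
        " s3_uri TEXT PRIMARY KEY,"
        " source TEXT NOT NULL,"
        " ingested TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO catalog VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def find_uris(conn, source):
    """Return the S3 locations recorded for one source, in sorted order."""
    cursor = conn.execute(
        "SELECT s3_uri FROM catalog WHERE source = ? ORDER BY s3_uri",
        (source,),
    )
    return [row[0] for row in cursor.fetchall()]
```

Here one table holds both the searchable attributes and the object locations, which is the split that the DynamoDB / ES pairing would later separate.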




Answer 1 · 2 · ACCEPTED · 2017-10-31 15:17:16