Data catalog and metadata management in AWS for a Data Lake architecture

We are setting up a data platform loosely based on the Data Lake architecture. We are evaluating candidates that provide a centralized data catalog, metadata management, and tagging. Glue seems very promising, but it is not yet publicly available, so we looked into:

  • Ground
  • Waterline
  • Zaloni

Ground is fairly DIY. It seems we would have to extend it extensively to make it work for us (scavenging metadata from S3, writing to Titan).
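To give a sense of the "scavenging from S3" part, here is a minimal sketch of what we have in mind. It is not Ground's API. `CatalogEntry`, `build_catalog`, and `tag` are hypothetical names, and the input is assumed to be a list of dicts shaped like the `Contents` entries that boto3's `list_objects_v2` returns (`Key`, `Size`, `LastModified`). Grouping by the first two key segments is also just an assumption about our bucket layout, and the write to Titan is left out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CatalogEntry:
    """One dataset in the (hypothetical) catalog, identified by its S3 prefix."""
    prefix: str
    object_count: int = 0
    total_bytes: int = 0
    last_modified: Optional[datetime] = None
    formats: set = field(default_factory=set)
    tags: set = field(default_factory=set)


def build_catalog(objects, depth=2):
    """Group S3 object records into entries keyed by the first `depth` key segments."""
    catalog = {}
    for obj in objects:
        parts = obj["Key"].split("/")
        filename = parts[-1]
        # Use only directory-like segments for the dataset prefix.
        prefix = "/".join(parts[:-1][:depth])
        entry = catalog.setdefault(prefix, CatalogEntry(prefix=prefix))
        entry.object_count += 1
        entry.total_bytes += obj["Size"]
        # Guess the format from the file extension; "unknown" if there is none.
        entry.formats.add(filename.rsplit(".", 1)[-1] if "." in filename else "unknown")
        modified = obj["LastModified"]
        if entry.last_modified is None or modified > entry.last_modified:
            entry.last_modified = modified
    return catalog


def tag(catalog, prefix, *labels):
    """Attach free-form tags to an existing catalog entry."""
    catalog[prefix].tags.update(labels)


if __name__ == "__main__":
    # Hand-written sample records standing in for a real bucket listing.
    sample = [
        {"Key": "raw/events/2017/01/a.json", "Size": 100,
         "LastModified": datetime(2017, 1, 1, tzinfo=timezone.utc)},
        {"Key": "curated/users/part-0.parquet", "Size": 10,
         "LastModified": datetime(2017, 1, 5, tzinfo=timezone.utc)},
    ]
    catalog = build_catalog(sample)
    tag(catalog, "raw/events", "clickstream")
    for entry in catalog.values():
        print(entry)
```

Even this toy version ignores schema inference, incremental rescans, and lineage, which is where most of the extension work would go.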

Waterline and Zaloni are packaged, full-blown solutions that might not be what we are looking for, since we prefer open-source point solutions.

Are there any alternatives that we should look at? We like the metamodel available in Ground and are leaning towards using it together with Kinesis schema management.

It might be worth reconsidering the DIY route. You'll spend a lot of time building and supporting the product you want instead of using it. I know it's a bit of marketing fluff, but Zaloni's page claims a 650% ROI versus building your own. There's got to be at least a little something in that.
