
Cascading Framework vs ETL tools like Talend

We have been using the Cascading framework to build our ETL jobs.

Cascading gives us:

  1. Optimized joins
  2. Jobs that run in parallel
  3. Checkpoints
  4. A choice of language for developers (Java, Ruby, Scala, Clojure)
  5. Unit testing
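
To make point 5 concrete: the appeal of unit testing here is that per-record logic can live in an ordinary function and be checked without a cluster. Below is a minimal, framework-free sketch of that idea. It deliberately does not use Cascading's API; the class `EtlStepSketch`, the methods `normalize` and `normalizeAll`, and the two-field record layout are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class EtlStepSketch {

    /**
     * Hypothetical per-record step: expects "name,country",
     * trims both fields and upper-cases the country code.
     */
    static String normalize(String line) {
        String[] fields = line.split(",", -1); // -1 keeps trailing empty fields
        if (fields.length != 2) {
            throw new IllegalArgumentException("expected 2 fields: " + line);
        }
        String name = fields[0].trim();
        String country = fields[1].trim().toUpperCase(Locale.ROOT);
        return name + "," + country;
    }

    /** Applies the step to a batch of lines, skipping blank ones. */
    static List<String> normalizeAll(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                out.add(normalize(line));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Small in-memory input stands in for a real source.
        List<String> input = Arrays.asList(" alice ,us", "", "bob, de ");
        System.out.println(normalizeAll(input));
    }
}
```

Because the step is a plain function, a test can feed it a handful of in-memory records and compare the result, which is the kind of check we would want to keep whichever tool we choose.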

We now have two options for converting some jobs from X ETL (which is costly) into Hadoop jobs:

  1. Cascading workflows
  2. Talend jobs

My questions are:

  1. Talend uses Pig, Hive, etc. as components to build a job. Does that bring any performance benefit, or does Talend make any improvements of its own on top of them?
  2. With Talend, do we still need to worry about unit testing (which the Cascading framework provides)?
  3. Is choosing Talend over Cascading a good option for creating the jobs (converting X ETL to Hadoop jobs)?
  4. Converting X ETL to Cascading workflows will require us to re-create all the components available in X ETL, but that is a one-time activity. We would also need to consider the other features that Talend Studio provides, such as:

     a. Data quality. b. Data profiling. c. Data lineage, etc.
  5. As far as maintainability is concerned, Cascading jobs are fairly easy to manage. Can anyone share some information on how Talend does here?

Bottom line: I am building a conversion tool from X ETL to Hadoop jobs, and I need to choose between the Cascading framework and Talend.

I can't answer all of your questions, but I can share my experience. With Talend, development is more productive than with a framework or a native language, and the source is easier to maintain because the components are optimized and the IDE view of your job is very clear. The debugging features are good: you can debug step by step, and you can view the generated source code.

For me, the drawback is configuration management: Talend does not cope well with working across many branches.

