
How to integrate the Hadoop framework into an existing system

I am planning to use the Hadoop framework in my web-based application, which currently runs only on localhost. I plan to use MapReduce for processing and HDFS as the distributed file system. The thing is, what is the first step I should take? Where should I start? I have studied the Hadoop framework thoroughly, learned how it works, and tried the WordCount example from the Internet.

Hadoop is a distributed framework for large-scale data processing. Your statement "I'm planning on implementing the mapreduce and hdfs as the distributed filesystem" rests on a misunderstanding: you do not implement MapReduce or HDFS yourself; you use what Hadoop already provides.

MapReduce is a programming model for processing data that lives on a distributed file system. HDFS is the Hadoop Distributed File System. Together, HDFS and MapReduce form the core of the Hadoop framework.
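
For reference, this is roughly what the WordCount example you tried looks like against Hadoop's Java MapReduce API (the org.apache.hadoop.mapreduce classes); the input and output HDFS paths are passed on the command line:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Mapper: emits (word, 1) for every token in the input line.
      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
          }
        }
      }

      // Reducer: sums the counts emitted for each word.
      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);  // combiner cuts shuffle traffic
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));    // HDFS input dir
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // HDFS output dir
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

You package this into a jar and submit it with "hadoop jar wordcount.jar WordCount <input> <output>". The point is that the cluster, not your code, handles distribution, scheduling, and fault tolerance.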

Hadoop does offline data processing, i.e. batch processing, so it is usually not useful directly in a web application's request path. What you can do is use HBase as the backend database for your web application. HBase is a distributed, column-oriented database that runs on top of HDFS.
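
As a rough sketch of what that looks like from the web application's side, here is a read and a write through the HBase Java client API; the "users" table, "profile" column family, and row key are made-up examples, and hbase-site.xml on the classpath is assumed to point at your cluster:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseBackendExample {
      public static void main(String[] args) throws IOException {
        // Reads hbase-site.xml from the classpath (ZooKeeper quorum etc.).
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("users"))) {  // hypothetical table

          // Write: row key "user123", column family "profile", qualifier "email".
          Put put = new Put(Bytes.toBytes("user123"));
          put.addColumn(Bytes.toBytes("profile"), Bytes.toBytes("email"),
                        Bytes.toBytes("user123@example.com"));
          table.put(put);

          // Read the row back; key lookups are fast enough for online serving.
          Result result = table.get(new Get(Bytes.toBytes("user123")));
          byte[] email = result.getValue(Bytes.toBytes("profile"), Bytes.toBytes("email"));
          System.out.println("email = " + Bytes.toString(email));
        }
      }
    }

This is the usual split: the web app serves reads and writes out of HBase in real time, while MapReduce jobs crunch the same data in batch behind the scenes.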

If you want to kick off, the first step is to set up a cluster of machines; starting with about 5 nodes is reasonable. Otherwise I would recommend a cloud solution: go for Amazon EMR.
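
If you go the EMR route, a cluster can be started programmatically. The sketch below uses the EMR client from the AWS SDK for Java (v1); the cluster name, release label, instance types, and IAM role names are assumptions you would adjust:

    import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduce;
    import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClientBuilder;
    import com.amazonaws.services.elasticmapreduce.model.Application;
    import com.amazonaws.services.elasticmapreduce.model.JobFlowInstancesConfig;
    import com.amazonaws.services.elasticmapreduce.model.RunJobFlowRequest;
    import com.amazonaws.services.elasticmapreduce.model.RunJobFlowResult;

    public class StartEmrCluster {
      public static void main(String[] args) {
        AmazonElasticMapReduce emr = AmazonElasticMapReduceClientBuilder.defaultClient();

        RunJobFlowRequest request = new RunJobFlowRequest()
            .withName("hadoop-test-cluster")              // hypothetical cluster name
            .withReleaseLabel("emr-6.15.0")               // pick a current EMR release
            .withApplications(new Application().withName("Hadoop"))
            .withServiceRole("EMR_DefaultRole")           // default EMR IAM roles
            .withJobFlowRole("EMR_EC2_DefaultRole")
            .withInstances(new JobFlowInstancesConfig()
                .withInstanceCount(5)                     // 1 master + 4 core nodes
                .withMasterInstanceType("m5.xlarge")
                .withSlaveInstanceType("m5.xlarge")
                .withKeepJobFlowAliveWhenNoSteps(true));  // keep cluster up between jobs

        RunJobFlowResult result = emr.runJobFlow(request);
        System.out.println("Cluster id: " + result.getJobFlowId());
      }
    }

The same cluster can also be created from the EMR console in a few clicks, which is an easier starting point than buying and wiring five machines yourself.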

Let me know if this helps!
