
Data from multiple MySQL tables to Hadoop map-reduce

We have the following scenario:

We have a chain of map-reduce processes implemented in Java. Currently we read data from a MySQL table and save the output to another MySQL table. Now we may need data from an additional table as input to the map/reduce process.

Possible Solutions:

a) We can use a join query to produce the input for the map process (see the sketch after this list), or

b) we can read the needed data over a plain JDBC connection, requesting it again and again (although I don't prefer this).
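A minimal sketch of option (a), assuming Hadoop's DBInputFormat (which accepts an arbitrary SQL query as the job input) together with the MySQL JDBC driver on the classpath. The table and column names (orders, customers, customer_id, name), the connection details, and the JoinInputJob/JoinRecord class names are made up for illustration; the mapper/reducer wiring is only a placeholder for the existing chain:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBInputFormat;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class JoinInputJob {

    // One row of the join result; DBInputFormat needs both Writable and DBWritable.
    public static class JoinRecord implements Writable, DBWritable {
        long orderId;
        String customerName;

        public void readFields(ResultSet rs) throws SQLException {
            orderId = rs.getLong(1);          // o.id
            customerName = rs.getString(2);   // c.name
        }

        public void write(PreparedStatement ps) throws SQLException {
            ps.setLong(1, orderId);
            ps.setString(2, customerName);
        }

        public void readFields(DataInput in) throws IOException {
            orderId = in.readLong();
            customerName = in.readUTF();
        }

        public void write(DataOutput out) throws IOException {
            out.writeLong(orderId);
            out.writeUTF(customerName);
        }

        public String toString() {
            return orderId + "," + customerName;
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // JDBC connection details for the source database (placeholders).
        DBConfiguration.configureDB(conf, "com.mysql.jdbc.Driver",
                "jdbc:mysql://localhost/test", "dbuser", "dbpassword");

        Job job = Job.getInstance(conf, "mysql-join-input");
        job.setJarByClass(JoinInputJob.class);
        job.setInputFormatClass(DBInputFormat.class);

        // The join query becomes the map input; the count query lets
        // DBInputFormat split the joined result set across mappers.
        DBInputFormat.setInput(job, JoinRecord.class,
                "SELECT o.id, c.name FROM orders o JOIN customers c ON o.customer_id = c.id",
                "SELECT COUNT(*) FROM orders o JOIN customers c ON o.customer_id = c.id");

        // Placeholder wiring: plug in the existing mapper/reducer chain here.
        job.setMapperClass(Mapper.class);        // identity mapper
        job.setNumReduceTasks(0);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(JoinRecord.class);
        FileOutputFormat.setOutputPath(job, new Path(args[0]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The count query should match the row count of the join query, since DBInputFormat uses it to decide how to divide the result set among mappers.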

Questions:

What are the best practices in such a scenario? We may move to MongoDB in the future. What would be the best practice in that case?

I don't think this is possible at the moment.

SQOOP and HIVE can be used.

You can use SQOOP to transfer data from a MySQL table to HDFS and then into HIVE. From HIVE (after your operations), you can export the tables back to MySQL (a sample export command follows the import example below).

Example :

  • First of all, download mysql-connector-java-5.0.8 and put the jar into Sqoop's lib and bin folders
  • Create the table definition in Hive with exactly the same field names and types as in MySQL

sqoop import --verbose --fields-terminated-by ',' --connect jdbc:mysql://localhost/test --table employee --hive-import --warehouse-dir /user/hive/warehouse --split-by id --hive-table employee
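For the export step back to MySQL mentioned above, a sqoop export along the following lines can be used; the target table name (employee_result) and the export directory are placeholders and should point at an existing MySQL table and the HDFS/Hive data you want to export:

sqoop export --connect jdbc:mysql://localhost/test --table employee_result --export-dir /user/hive/warehouse/employee --input-fields-terminated-by ','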

Follow this Link for reference
