Oracle to MongoDB data migration using Kafka
I am trying to migrate data from Oracle to MongoDB using Kafka. I took a sample record set of 10 million rows, each with 90 columns, and each row is about 5 KB in size.
I am dividing the data into 10 threads, but one of the threads does not run every time. When I check the data, I see 1 million records are missing in MongoDB.
Main class:
int totalRec = countNoOfRecordsToBeProcessed;
int minRownum = 0;
int maxRownum = 0;
int recInThread = totalRec / 10;
System.out.println("oracle " + new Date());
for (int i = minRownum; i <= totalRec; i = i + recInThread + 1) {
    KafkaThread kth = new KafkaThread(i, i + recInThread, conn);
    Thread th = new Thread(kth);
    th.start();
}
System.out.println("oracle done " + new Date());
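Note that the loop above starts the worker threads but never joins them, so the "done" message prints before the workers have finished. For what it's worth, the range arithmetic can be checked in isolation; the sketch below (class and method names are mine, not from the original) reproduces the same partitioning and runs each range on a fixed-size pool, using shutdown()/awaitTermination() so the main thread actually waits for every worker:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: compute the same [start, end] row ranges as the loop
// above, then run each range on a thread pool and wait for all of them.
public class PartitionSketch {

    // Reproduces the loop arithmetic from the main class.
    static List<int[]> ranges(int totalRec, int threads) {
        int recInThread = totalRec / threads;
        List<int[]> out = new ArrayList<>();
        for (int i = 0; i <= totalRec; i = i + recInThread + 1) {
            out.add(new int[] { i, i + recInThread });
        }
        return out;
    }

    public static void main(String[] args) throws InterruptedException {
        List<int[]> parts = ranges(10_000_000, 10);
        ExecutorService pool = Executors.newFixedThreadPool(parts.size());
        for (int[] p : parts) {
            // Stand-in for new KafkaThread(p[0], p[1], conn).
            pool.execute(() -> System.out.println("range " + p[0] + " - " + p[1]));
        }
        pool.shutdown();                            // accept no new tasks
        pool.awaitTermination(1, TimeUnit.MINUTES); // block until all workers finish
    }
}
```

Printing the ranges this way makes it easy to see whether the ten intervals together cover all 10 million rows before involving Kafka at all.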
Kafka producer thread class:
JSONObject obj = new JSONObject();
while (rs.next()) {
    int colCount = rs.getMetaData().getColumnCount(); // number of columns, not rows
    for (int i = 0; i < colCount; i++) {
        obj.put(rs.getMetaData().getColumnLabel(i + 1).toLowerCase(),
                rs.getObject(i + 1));
    }
    producer.send(new ProducerRecord<String, String>("oracle_1", obj.toString()));
    obj = new JSONObject();
}
Consumer class:
KafkaConsumer<Long, String> consumer = new KafkaConsumer<>(props);
// subscribe to topic
consumer.subscribe(Arrays.asList(topicName));

MongoClientURI clientURI = new MongoClientURI(mongoURI);
MongoClient mongoClient = new MongoClient(clientURI);
MongoDatabase database = mongoClient.getDatabase(clientURI.getDatabase());
final MongoCollection<Document> collection = database.getCollection(clientURI.getCollection());

while (true) {
    final ConsumerRecords<Long, String> consumerRecords = consumer.poll(10000);
    if (consumerRecords.count() != 0) {
        // typed as WriteModel<Document> so bulkWrite needs no unchecked cast
        List<WriteModel<Document>> list1 = new ArrayList<>();
        consumerRecords.forEach(record -> {
            Document doc = Document.parse(record.value());
            list1.add(new InsertOneModel<>(doc));
        });
        collection.bulkWrite(list1, new BulkWriteOptions().ordered(false));
        consumer.commitAsync();
    }
}
My advice: use the Kafka Connect JDBC connector to pull the data in, and a Kafka Connect MongoDB sink to push the data out. Otherwise you are just reinventing the wheel. Kafka Connect is part of Apache Kafka.
Getting started with Kafka Connect:
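As a sketch of what that looks like in practice, a standalone Connect worker takes a source and a sink config. The property values below are placeholders (host names, credentials, table and collection names are illustrative, not taken from the question); the JDBC source connector ships with Confluent's distribution, and the MongoDB sink shown is MongoDB's official connector:

```properties
# oracle-source.properties -- JDBC source connector (values illustrative)
name=oracle-source
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
connection.url=jdbc:oracle:thin:@//dbhost:1521/ORCL
connection.user=scott
connection.password=secret
table.whitelist=MY_TABLE
mode=incrementing
incrementing.column.name=ID
topic.prefix=oracle_

# mongo-sink.properties -- MongoDB Kafka sink connector (values illustrative)
name=mongo-sink
connector.class=com.mongodb.kafka.connect.MongoSinkConnector
topics=oracle_MY_TABLE
connection.uri=mongodb://localhost:27017
database=mydb
collection=mycol
```

Both files are then passed to the standalone worker, e.g. `connect-standalone worker.properties oracle-source.properties mongo-sink.properties`, and Connect handles the batching, offset tracking, and delivery guarantees that the hand-rolled threads above are dropping.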