MongoDB & Spark: “com.mongodb.MongoSocketReadException: Prematurely reached end of stream”
I have a Java application that processes a Kafka stream of Avro messages and, for each message, runs a query against a MongoDB collection.
After a few dozen messages have been processed correctly, the application stops and throws "com.mongodb.MongoSocketReadException: Prematurely reached end of stream".
Here's the code:
JavaPairInputDStream<String, byte[]> directKafkaStream = KafkaUtils.createDirectStream(jsc,
        String.class, byte[].class, StringDecoder.class, DefaultDecoder.class, kafkaParams, topics);
directKafkaStream.foreachRDD(rdd -> {
    rdd.foreach(avroRecord -> {
        byte[] encodedAvroData = avroRecord._2;
        LocationType t = deserialize(encodedAvroData);
        MongoClientOptions.Builder options_builder = new MongoClientOptions.Builder();
        options_builder.maxConnectionIdleTime(60000);
        MongoClientOptions options = options_builder.build();
        MongoClient mongo = new MongoClient("localhost:27017", options);
        MongoDatabase database = mongo.getDatabase("DB");
        MongoCollection<Document> collection = database.getCollection("collection");
        Document myDoc = collection.find(eq("key", 4)).first();
        System.out.println(myDoc);
    });
});
First, you should not open a MongoDB connection for each record. Second, you should close every connection you open once you are done with it.
MongoDB does not cope well with many connections (hundreds, or even thousands) being opened and never closed. Each MongoClient keeps its own connection pool, so creating one per record multiplies the number of open sockets.
Here is an example of opening a single MongoDB connection per partition of an RDD:
directKafkaStream.foreachRDD(rdd -> {
    rdd.foreachPartition(it -> {
        // Open only one client per partition and reuse it for every record
        MongoClient mongo = new MongoClient("localhost:27017");
        try {
            MongoDatabase database = mongo.getDatabase("DB");
            MongoCollection<Document> collection = database.getCollection("collection");
            while (it.hasNext()) {
                byte[] encodedAvroData = it.next()._2;
                LocationType t = deserialize(encodedAvroData);
                Document myDoc = collection.find(eq("key", 4)).first();
                System.out.println(myDoc);
            }
        } finally {
            // Close the client even if a record fails, so the connection is not leaked
            mongo.close();
        }
    });
});
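To make the difference concrete, here is a small sketch that needs neither Spark nor MongoDB. FakeClient is a hypothetical stand-in (not part of any driver) that only counts how many times a client is opened and closed, and the nested lists stand in for RDD partitions. It is meant as an illustration of the two patterns, not as a measurement of a real driver.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ConnectionCountDemo {

    /** Hypothetical stand-in for a database client; it only counts opens and closes. */
    static final class FakeClient implements AutoCloseable {
        static final AtomicInteger opened = new AtomicInteger();
        static final AtomicInteger closed = new AtomicInteger();

        FakeClient() { opened.incrementAndGet(); }

        String find(String key) { return "doc-for-" + key; }

        @Override
        public void close() { closed.incrementAndGet(); }
    }

    static void reset() {
        FakeClient.opened.set(0);
        FakeClient.closed.set(0);
    }

    /** Mirrors the question's code: a new client for every record, never closed. */
    public static int[] perRecord(List<List<String>> partitions) {
        reset();
        for (List<String> partition : partitions) {
            for (String record : partition) {
                FakeClient client = new FakeClient(); // intentionally leaked
                client.find(record);
            }
        }
        return new int[] { FakeClient.opened.get(), FakeClient.closed.get() };
    }

    /** Mirrors the answer's code: one client per partition, always closed. */
    public static int[] perPartition(List<List<String>> partitions) {
        reset();
        for (List<String> partition : partitions) {
            try (FakeClient client = new FakeClient()) {
                Iterator<String> it = partition.iterator();
                while (it.hasNext()) {
                    client.find(it.next());
                }
            }
        }
        return new int[] { FakeClient.opened.get(), FakeClient.closed.get() };
    }

    public static void main(String[] args) {
        // Two "partitions" holding five records in total
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b", "c"),
                Arrays.asList("d", "e"));
        int[] r = perRecord(partitions);
        int[] p = perPartition(partitions);
        System.out.println("per record:    opened=" + r[0] + " closed=" + r[1]);
        System.out.println("per partition: opened=" + p[0] + " closed=" + p[1]);
    }
}
```

With five records in two partitions, the per-record version opens five clients and closes none, while the per-partition version opens two and closes both. On a real stream the per-record count grows with every message, which is why the connections eventually run out.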