
Connecting to Mongo with replica set and mongo-hadoop connector for Spark

I have a Spark process that currently uses the mongo-hadoop bridge (from https://github.com/mongodb/mongo-hadoop/blob/master/spark/src/main/python/README.rst ) to access the mongo database:

mongo_url = 'mongodb://localhost:27017/db_name.collection_name'
mongo_rdd = spark_context.mongoRDD(mongo_url)

The mongo instance is now being upgraded to a cluster that can only be accessed with a replica set.

How do I create an RDD using the mongo-hadoop connector? The mongoRDD() call delegates to mongoPairRDD(), which may not accept multiple host strings.

The MongoDB Hadoop Connector mongoRDD can take a valid MongoDB connection string.

For example, if it's now a replica set you can specify:

mongodb://db1.example.net,db2.example.net:27002,db3.example.net:27003/db_name?replicaSet=YourReplicaSetName
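As a minimal sketch, the replica-set URI can be assembled from its parts and then passed to mongoRDD() exactly as in the single-host case. The host names, ports, database/collection, and replica set name below are placeholders, not values from the original question:

```python
# Build a replica-set connection string for the mongo-hadoop connector.
# All hosts, ports, and names here are placeholders -- substitute your
# own cluster's values. A host without an explicit port uses 27017.
hosts = ["db1.example.net", "db2.example.net:27002", "db3.example.net:27003"]
db_and_collection = "db_name.collection_name"  # mongoRDD needs db.collection
replica_set = "YourReplicaSetName"

mongo_url = "mongodb://{}/{}?replicaSet={}".format(
    ",".join(hosts), db_and_collection, replica_set
)
print(mongo_url)

# The resulting URI is then used the same way as before:
#   mongo_rdd = spark_context.mongoRDD(mongo_url)
```

The driver handles discovering the remaining replica-set members; the seed list only needs enough hosts to make initial contact.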

