Connecting to Mongo with replica set and mongo-hadoop connector for Spark
I have a Spark process that currently uses the mongo-hadoop bridge (from https://github.com/mongodb/mongo-hadoop/blob/master/spark/src/main/python/README.rst ) to access the mongo database:
mongo_url = 'mongodb://localhost:27017/db_name.collection_name'
mongo_rdd = spark_context.mongoRDD(mongo_url)
The mongo instance is now being upgraded to a cluster that can only be accessed through a replica set.
How do I create an RDD using the mongo-hadoop connector? mongoRDD() delegates to mongoPairRDD(), which may not accept multiple host strings.
The MongoDB Hadoop Connector's mongoRDD can take any valid MongoDB connection string.
For example, if it's now a replica set you can specify:
mongodb://db1.example.net,db2.example.net:27002,db3.example.net:27003/db_name?replicaSet=YourReplicaSetName
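As a sketch, a helper like the one below can assemble such a URI from a list of host:port pairs; the host names, database, and replica set name are placeholders, and the commented mongoRDD call assumes pymongo_spark has been activated as in the README above:

```python
# Build a replica-set MongoDB connection string for spark_context.mongoRDD().
# All hosts, the database/collection, and the replica set name here are
# hypothetical placeholders -- substitute your own deployment's values.
def replica_set_uri(hosts, db_and_collection, replica_set):
    """Join host:port pairs and append the replicaSet option."""
    return 'mongodb://{}/{}?replicaSet={}'.format(
        ','.join(hosts), db_and_collection, replica_set)

mongo_url = replica_set_uri(
    ['db1.example.net:27017', 'db2.example.net:27002', 'db3.example.net:27003'],
    'db_name.collection_name',
    'YourReplicaSetName')

# The resulting URI is passed to mongoRDD() exactly like a single-host URI:
# mongo_rdd = spark_context.mongoRDD(mongo_url)
print(mongo_url)
```

The driver resolves the member list and routes reads according to the replica set configuration, so no other change to the mongoRDD() call should be needed.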
See also related information: