MongoDB SDK Failover not working
I have set up a replica set on three machines (192.168.122.21, 192.168.122.147 and 192.168.122.148) and am using the Java SDK to interact with the MongoDB cluster:
ArrayList<ServerAddress> addrs = new ArrayList<ServerAddress>();
addrs.add(new ServerAddress("192.168.122.21", 27017));
addrs.add(new ServerAddress("192.168.122.147", 27017));
addrs.add(new ServerAddress("192.168.122.148", 27017));
this.mongoClient = new MongoClient(addrs);
this.db = this.mongoClient.getDB(this.db_name);
this.collection = this.db.getCollection(this.collection_name);
Once the connection is established, I insert a simple test document many times:
for (int i = 0; i < this.inserts; i++) {
    try {
        this.collection.insert(new BasicDBObject(String.valueOf(i), "test"));
    } catch (Exception e) {
        System.out.println("Error on inserting element: " + i);
        e.printStackTrace();
    }
}
When I simulate a crash of the primary node (by powering it off), the MongoDB cluster fails over successfully:
19:08:03.907+0100 [rsHealthPoll] replSet info 192.168.122.21:27017 is down (or slow to respond):
19:08:03.907+0100 [rsHealthPoll] replSet member 192.168.122.21:27017 is now in state DOWN
19:08:04.153+0100 [rsMgr] replSet info electSelf 1
19:08:04.154+0100 [rsMgr] replSet couldn't elect self, only received -9999 votes
19:08:05.648+0100 [conn15] replSet info voting yea for 192.168.122.148:27017 (2)
19:08:10.681+0100 [rsMgr] replSet not trying to elect self as responded yea to someone else recently
19:08:10.910+0100 [rsHealthPoll] replset info 192.168.122.21:27017 heartbeat failed, retrying
19:08:16.394+0100 [rsMgr] replSet not trying to elect self as responded yea to someone else recently
19:08:22.876+.
19:08:22.912+0100 [rsHealthPoll] replset info 192.168.122.21:27017 heartbeat failed, retrying
19:08:23.623+0100 [SyncSourceFeedbackThread] replset setting syncSourceFeedback to 192.168.122.148:27017
19:08:23.917+0100 [rsHealthPoll] replSet member 192.168.122.148:27017 is now in state PRIMARY
The MongoDB driver on the client side recognizes this as well:
Dec 01, 2014 7:08:16 PM com.mongodb.ConnectionStatus$UpdatableNode update
WARNING: Server seen down: /192.168.122.21:27017 - java.io.IOException - message: Read timed out
WARNING: Server seen down: /192.168.122.21:27017 - java.io.IOException - message: couldn't connect to [/192.168.122.21:27017] bc:java.net.SocketTimeoutException: connect timed out
Dec 01, 2014 7:08:36 PM com.mongodb.DBTCPConnector setMasterAddress
WARNING: Primary switching from /192.168.122.21:27017 to /192.168.122.148:27017
But it still keeps trying (forever) to connect to the old node:
Dec 01, 2014 7:08:50 PM com.mongodb.ConnectionStatus$UpdatableNode update
WARNING: Server seen down: /192.168.122.21:27017 - java.io.IOException - message: couldn't connect to [/192.168.122.21:27017] bc:java.net.NoRouteToHostException: No route to host
.....
Dec 01, 2014 7:10:43 PM com.mongodb.ConnectionStatus$UpdatableNode update
WARNING: Server seen down: /192.168.122.21:27017 - java.io.IOException -message: couldn't connect to [/192.168.122.21:27017] bc:java.net.NoRouteToHostException: No route to host
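From the moment the primary fails and the secondary becomes primary, the document count in the database stays the same. This is the output from the same node during the process: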
rs0:SECONDARY> db.test_collection.find().count()
12260161
rs0:PRIMARY> db.test_collection.find().count()
12260161
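Update: With WriteConcern Unacknowledged it works as designed. Inserts are then also performed on the new primary, and all operations issued during the election are lost.
With WriteConcern Acknowledged, it looks like the operation waits indefinitely for an ACK from the crashed primary. That would explain why the program continues once the crashed server boots and rejoins the cluster. But in my case I do not want the driver to wait forever; it should raise an error after a certain time.
Update: WriteConcern Acknowledged also works as expected when I kill the mongod process on the primary. In that case the failover takes only about 3 seconds. No inserts are performed during that time, and they resume once the new primary has been elected.
So the problem only occurs when simulating a node failure (power off / network down). In that case the operation hangs until the failed node is up again.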
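One plausible reading of the difference between the two cases (an assumption, not something confirmed in this thread): when only the mongod process is killed, the operating system is still running and closes the connection, so the client's pending read fails right away. A powered-off host sends nothing at all, so a blocking read without a timeout has no reason to ever return. The sketch below illustrates that with plain java.net sockets only; it does not use the MongoDB driver. The class name SilentPeerDemo, the local listener standing in for the silent host, and the 200 ms value are all made up for illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class, not part of any MongoDB API.
final class SilentPeerDemo {

    /**
     * Connects to a local listener that never sends a byte (a stand-in for
     * a host that has lost power) and attempts a blocking read.
     * Returns true if the read gave up with a SocketTimeoutException.
     */
    static boolean readTimesOut(int readTimeoutMs) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket silent = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, silent.getLocalPort())) {
            // A value of 0 would mean "no timeout": the read below would block
            // for as long as the peer stays silent.
            client.setSoTimeout(readTimeoutMs);
            client.getInputStream().read();
            return false; // not reached here: the peer never writes or closes
        } catch (SocketTimeoutException e) {
            return true; // the read was abandoned after readTimeoutMs
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

With a finite read timeout the caller gets an exception it can handle; without one, the thread stays parked in read() until the peer comes back, which matches the hang described above.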
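Is your application still usable? Since that server is still in your seed list, the driver will, as far as I know, keep trying to connect to it. As long as any other server in your seed list can become primary, your application should still work.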
Specifying the connection timeout values explicitly solved the problem. See also: http://api.mongodb.org/java/2.7.0/com/mongodb/MongoOptions.html
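The answer does not show the code that was used, so the following is only a sketch of what setting explicit timeouts might look like. It assumes a 2.x Java driver that ships MongoClientOptions (the counterpart of the linked MongoOptions for the MongoClient class used in the question) and that the driver jar is on the classpath. The class name TimeoutConfigSketch and the 5 and 10 second values are illustrative assumptions, not values taken from the thread.

```java
import com.mongodb.MongoClient;
import com.mongodb.MongoClientOptions;
import com.mongodb.ServerAddress;

import java.util.Arrays;
import java.util.List;

// Hypothetical class name; requires the MongoDB Java driver on the classpath.
public class TimeoutConfigSketch {
    public static void main(String[] args) throws Exception {
        // Same seed list as in the question.
        List<ServerAddress> addrs = Arrays.asList(
                new ServerAddress("192.168.122.21", 27017),
                new ServerAddress("192.168.122.147", 27017),
                new ServerAddress("192.168.122.148", 27017));

        // Timeouts are in milliseconds; the values are assumptions to tune.
        MongoClientOptions options = MongoClientOptions.builder()
                .connectTimeout(5000)   // give up on opening a connection after 5 s
                .socketTimeout(10000)   // give up on a pending send/receive after 10 s
                .build();

        MongoClient mongoClient = new MongoClient(addrs, options);
        try {
            // ... run the inserts from the question here ...
        } finally {
            mongoClient.close();
        }
    }
}
```

With a finite socket timeout, an insert that is waiting on the powered-off primary should surface as an exception in the existing catch block rather than block until the node returns. Whether the failed insert is then retried against the new primary is up to the application.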