MongoDB "NetworkInterfaceExceededTimeLimit" in a replica set

Solved:

It must have been a version-specific bug. Updating to mongodb-org@4.0.5 solved the issue, and the machines now connect fine. I'll leave the question up for anyone who eventually faces the same issue.

Original question:

I have two servers in a two-member replica set.

mx.aireclaim.com is the PRIMARY and vimax.aireclaim.com is the SECONDARY. After a reboot of mx.aireclaim.com I'm faced with a connection issue on the secondary side. I've checked the firewall and can access port 27017 from each machine to the other.

I keep getting this error on my secondary:

    Error in heartbeat (...) response status: NetworkInterfaceExceededTimeLimit:
    Couldn't get a connection within the time limit

Is there something obvious I'm missing? I've tried syncing the date on both machines, disabling firewalls, multiple restarts, and config changes, to no avail.

Here are the replica set config and rs.status():

PRIMARY

    aireclaimRs:PRIMARY> rs.config()
    {
        "_id" : "aireclaimRs",
        "version" : 3,
        "protocolVersion" : NumberLong(1),
        "members" : [
            {
                "_id" : 0,
                "host" : "vimax.aireclaim.com:27017",
                "arbiterOnly" : false,
                "buildIndexes" : true,
                "hidden" : false,
                "priority" : 2,
                "tags" : {
                    
                },
                "slaveDelay" : NumberLong(0),
                "votes" : 1
            },
            {
                "_id" : 1,
                "host" : "mx.aireclaim.com:27017",
                "arbiterOnly" : false,
                "buildIndexes" : true,
                "hidden" : false,
                "priority" : 1,
                "tags" : {
                    
                },
                "slaveDelay" : NumberLong(0),
                "votes" : 1
            }
        ],
        "settings" : {
            "chainingAllowed" : true,
            "heartbeatIntervalMillis" : 2000,
            "heartbeatTimeoutSecs" : 10,
            "electionTimeoutMillis" : 10000,
            "catchUpTimeoutMillis" : -1,
            "catchUpTakeoverDelayMillis" : 30000,
            "getLastErrorModes" : {
                
            },
            "getLastErrorDefaults" : {
                "w" : 1,
                "wtimeout" : 0
            },
            "replicaSetId" : ObjectId("5be0f8105dfcbe069f5c8533")
        }
    }
    aireclaimRs:PRIMARY> rs.status()
    {
        "set" : "aireclaimRs",
        "date" : ISODate("2020-03-23T20:08:25.719Z"),
        "myState" : 1,
        "term" : NumberLong(20),
        "heartbeatIntervalMillis" : NumberLong(2000),
        "optimes" : {
            "lastCommittedOpTime" : {
                "ts" : Timestamp(0, 0),
                "t" : NumberLong(-1)
            },
            "appliedOpTime" : {
                "ts" : Timestamp(1584994100, 1),
                "t" : NumberLong(20)
            },
            "durableOpTime" : {
                "ts" : Timestamp(1584994100, 1),
                "t" : NumberLong(20)
            }
        },
        "members" : [
            {
                "_id" : 0,
                "name" : "vimax.aireclaim.com:27017",
                "health" : 1,
                "state" : 2,
                "stateStr" : "SECONDARY",
                "uptime" : 1595,
                "optime" : {
                    "ts" : Timestamp(1584838063, 1),
                    "t" : NumberLong(11)
                },
                "optimeDurable" : {
                    "ts" : Timestamp(1584838063, 1),
                    "t" : NumberLong(11)
                },
                "optimeDate" : ISODate("2020-03-22T00:47:43Z"),
                "optimeDurableDate" : ISODate("2020-03-22T00:47:43Z"),
                "lastHeartbeat" : ISODate("2020-03-23T20:08:24.148Z"),
                "lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),
                "pingMs" : NumberLong(0),
                "configVersion" : 3
            },
            {
                "_id" : 1,
                "name" : "mx.aireclaim.com:27017",
                "health" : 1,
                "state" : 1,
                "stateStr" : "PRIMARY",
                "uptime" : 8376,
                "optime" : {
                    "ts" : Timestamp(1584994100, 1),
                    "t" : NumberLong(20)
                },
                "optimeDate" : ISODate("2020-03-23T20:08:20Z"),
                "electionTime" : Timestamp(1584992519, 1),
                "electionDate" : ISODate("2020-03-23T19:41:59Z"),
                "configVersion" : 3,
                "self" : true
            }
        ],
        "ok" : 1,
        "operationTime" : Timestamp(1584994100, 1),
        "$clusterTime" : {
            "clusterTime" : Timestamp(1584994100, 1),
            "signature" : {
                "hash" : BinData(0,"oGCz38lgYjsGTrWc3maAD2vyc6M="),
                "keyId" : NumberLong("6765287379189104641")
            }
        }
    }

SECONDARY

    aireclaimRs:SECONDARY> rs.status()
    {
        "set" : "aireclaimRs",
        "date" : ISODate("2020-03-23T19:56:37.132Z"),
        "myState" : 2,
        "term" : NumberLong(20),
        "syncingTo" : "",
        "syncSourceHost" : "",
        "syncSourceId" : -1,
        "heartbeatIntervalMillis" : NumberLong(2000),
        "optimes" : {
            "lastCommittedOpTime" : {
                "ts" : Timestamp(0, 0),
                "t" : NumberLong(-1)
            },
            "appliedOpTime" : {
                "ts" : Timestamp(1584838063, 1),
                "t" : NumberLong(11)
            },
            "durableOpTime" : {
                "ts" : Timestamp(1584838063, 1),
                "t" : NumberLong(11)
            }
        },
        "members" : [
            {
                "_id" : 0,
                "name" : "vimax.aireclaim.com:27017",
                "health" : 1,
                "state" : 2,
                "stateStr" : "SECONDARY",
                "uptime" : 890,
                "optime" : {
                    "ts" : Timestamp(1584838063, 1),
                    "t" : NumberLong(11)
                },
                "optimeDate" : ISODate("2020-03-22T00:47:43Z"),
                "syncingTo" : "",
                "syncSourceHost" : "",
                "syncSourceId" : -1,
                "infoMessage" : "",
                "configVersion" : 3,
                "self" : true,
                "lastHeartbeatMessage" : ""
            },
            {
                "_id" : 1,
                "name" : "mx.aireclaim.com:27017",
                "health" : 0,
                "state" : 8,
                "stateStr" : "(not reachable/healthy)",
                "uptime" : 0,
                "optime" : {
                    "ts" : Timestamp(0, 0),
                    "t" : NumberLong(-1)
                },
                "optimeDurable" : {
                    "ts" : Timestamp(0, 0),
                    "t" : NumberLong(-1)
                },
                "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
                "optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
                "lastHeartbeat" : ISODate("2020-03-23T19:56:31.106Z"),
                "lastHeartbeatRecv" : ISODate("2020-03-23T19:56:36.034Z"),
                "pingMs" : NumberLong(0),
                "lastHeartbeatMessage" : "Couldn't get a connection within the time limit",
                "syncingTo" : "",
                "syncSourceHost" : "",
                "syncSourceId" : -1,
                "infoMessage" : "",
                "configVersion" : -1
            }
        ],
        "ok" : 1,
        "operationTime" : Timestamp(1584838063, 1),
        "$clusterTime" : {
            "clusterTime" : Timestamp(1584993390, 1),
            "signature" : {
                "hash" : BinData(0,"zINnjtBKXZ14gJrxrUbT2zyurqQ="),
                "keyId" : NumberLong("6765287379189104641")
            }
        }
    }
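
The key signal in the secondary's output is the member with `"health" : 0` and a non-empty `lastHeartbeatMessage`. As a rough sketch (the helper name `unreachableMembers` is hypothetical, not part of MongoDB), those members can be pulled out of an `rs.status()`-shaped object like this. The sample object is a trimmed copy of the output above, so the snippet runs under plain Node.js. In the mongo shell one would presumably pass `rs.status()` instead and print with `printjson`.

```javascript
// Hypothetical helper: list the members this node cannot reach, based on
// the "health" and "lastHeartbeatMessage" fields returned by rs.status().
function unreachableMembers(status) {
  return status.members
    .filter(function (m) { return !m.self && m.health === 0; })
    .map(function (m) {
      return {
        name: m.name,
        stateStr: m.stateStr,
        message: m.lastHeartbeatMessage || ""
      };
    });
}

// Trimmed copy of the secondary's rs.status() output from the question.
const secondaryStatus = {
  set: "aireclaimRs",
  myState: 2,
  members: [
    { _id: 0, name: "vimax.aireclaim.com:27017", health: 1, state: 2,
      stateStr: "SECONDARY", self: true, lastHeartbeatMessage: "" },
    { _id: 1, name: "mx.aireclaim.com:27017", health: 0, state: 8,
      stateStr: "(not reachable/healthy)",
      lastHeartbeatMessage: "Couldn't get a connection within the time limit" }
  ]
};

const unreachable = unreachableMembers(secondaryStatus);
console.log(JSON.stringify(unreachable, null, 2));
```

With the data from the question, only mx.aireclaim.com:27017 is reported, together with the heartbeat message.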

This is what is happening in the logs:

Primary

    tail /var/log/mongodb/mongodb.log
    2020-03-23T21:08:09.420+0100 I NETWORK  [conn2630] end connection 144.76.84.5:50114 (2 connections now open)
    2020-03-23T21:08:29.420+0100 I NETWORK  [listener] connection accepted from 144.76.84.5:50118 #2631 (3 connections now open)
    2020-03-23T21:08:29.420+0100 I NETWORK  [conn2631] end connection 144.76.84.5:50118 (2 connections now open)
    2020-03-23T21:08:49.420+0100 I NETWORK  [listener] connection accepted from 144.76.84.5:50122 #2632 (3 connections now open)
    2020-03-23T21:08:49.420+0100 I NETWORK  [conn2632] end connection 144.76.84.5:50122 (2 connections now open)
    2020-03-23T21:09:09.421+0100 I NETWORK  [listener] connection accepted from 144.76.84.5:50126 #2633 (3 connections now open)
    2020-03-23T21:09:09.421+0100 I NETWORK  [conn2633] end connection 144.76.84.5:50126 (2 connections now open)
    2020-03-23T21:09:18.717+0100 I NETWORK  [conn2629] end connection 127.0.0.1:60316 (1 connection now open)
    2020-03-23T21:09:29.421+0100 I NETWORK  [listener] connection accepted from 144.76.84.5:50130 #2634 (2 connections now open)
    2020-03-23T21:09:29.421+0100 I NETWORK  [conn2634] end connection 144.76.84.5:50130 (1 connection now open)

Secondary

    tail /var/log/mongodb/mongod.log
    2020-03-23T20:48:13.408+0100 I REPL     [replexec-1] Not starting an election, since we are not electable due to: Not standing for election because I cannot see a majority (mask 0x1)
    2020-03-23T20:48:17.592+0100 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 74) to mx.aireclaim.com:27017, response status: NetworkInterfaceExceededTimeLimit: Couldn't get a connection within the time limit
    2020-03-23T20:48:19.658+0100 I REPL     [rsBackgroundSync] waiting for 2 pings from other members before syncing
    2020-03-23T20:48:24.791+0100 I REPL     [replexec-1] Not starting an election, since we are not electable due to: Not standing for election because I cannot see a majority (mask 0x1)
    2020-03-23T20:48:28.092+0100 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 76) to mx.aireclaim.com:27017, response status: NetworkInterfaceExceededTimeLimit: Couldn't get a connection within the time limit
    2020-03-23T20:48:29.593+0100 I ASIO     [NetworkInterfaceASIO-Replication-0] Failed to connect to mx.aireclaim.com:27017 - NetworkInterfaceExceededTimeLimit: Operation timed out
    2020-03-23T20:48:29.593+0100 I ASIO     [NetworkInterfaceASIO-Replication-0] Connecting to mx.aireclaim.com:27017
    2020-03-23T20:48:34.661+0100 I REPL     [rsBackgroundSync] waiting for 2 pings from other members before syncing
    2020-03-23T20:48:35.839+0100 I REPL     [replexec-1] Not starting an election, since we are not electable due to: Not standing for election because I cannot see a majority (mask 0x1)
    2020-03-23T20:48:38.593+0100 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 77) to mx.aireclaim.com:27017, response status: NetworkInterfaceExceededTimeLimit: Couldn't get a connection within the time limit
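
To see how often the heartbeat fails and against which host, the `REPL_HB` lines can be tallied. The following is only a sketch: the regular expression is derived from the log lines quoted above (MongoDB 3.6 text log format) and may not match other versions, and the names `HEARTBEAT_ERROR` and `countHeartbeatErrors` are made up for illustration.

```javascript
// Pattern derived from the REPL_HB lines quoted in the question.
// Captures: 1 = requestId, 2 = target host:port, 3 = status name, 4 = message.
const HEARTBEAT_ERROR =
  /Error in heartbeat \(requestId: (\d+)\) to ([^,\s]+), response status: (\w+): (.*)$/;

// Hypothetical helper: count heartbeat failures per target host.
function countHeartbeatErrors(lines) {
  const counts = {};
  lines.forEach(function (line) {
    const match = HEARTBEAT_ERROR.exec(line);
    if (match) {
      const host = match[2];
      counts[host] = (counts[host] || 0) + 1;
    }
  });
  return counts;
}

// A few lines copied from the secondary's log above.
const sampleLines = [
  "2020-03-23T20:48:17.592+0100 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 74) to mx.aireclaim.com:27017, response status: NetworkInterfaceExceededTimeLimit: Couldn't get a connection within the time limit",
  "2020-03-23T20:48:19.658+0100 I REPL     [rsBackgroundSync] waiting for 2 pings from other members before syncing",
  "2020-03-23T20:48:28.092+0100 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 76) to mx.aireclaim.com:27017, response status: NetworkInterfaceExceededTimeLimit: Couldn't get a connection within the time limit",
  "2020-03-23T20:48:38.593+0100 I REPL_HB  [replexec-1] Error in heartbeat (requestId: 77) to mx.aireclaim.com:27017, response status: NetworkInterfaceExceededTimeLimit: Couldn't get a connection within the time limit"
];

const heartbeatErrors = countHeartbeatErrors(sampleLines);
console.log(heartbeatErrors);
```

For the sample lines this counts three failures, all against mx.aireclaim.com:27017, roughly one every ten seconds, which matches the `heartbeatTimeoutSecs` of 10 in the config.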

So it looks like the primary accepts the connection and then ends it.

You would think that there is some connectivity issue between the two machines, but I can do this:

    aireclaim@aireclaim-platform-server:~> mongo mx.aireclaim.com
    MongoDB shell version v3.6.10
    connecting to: mongodb://mx.aireclaim.com:27017/test?gssapiServiceName=mongodb
    Implicit session: session { "id" : UUID("de193403-69ca-4f74-a6c5-ee384eb420d7") }
    MongoDB server version: 3.6.3
    aireclaimRs:PRIMARY> 

and:

    root@mx  /home/aireclaim  mongo vimax.aireclaim.com
    MongoDB shell version v3.6.3
    connecting to: mongodb://vimax.aireclaim.com:27017/test
    MongoDB server version: 3.6.10
    aireclaimRs:SECONDARY> 

Also, I can attest that both machines have port 27017 open and that they are configured to listen on those IPs.

In fact, the entire thing remains functional somehow. The clients that attempt to connect to the replica set manage to do so, but it looks as though the secondary is no longer replicating.
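
The primary's rs.status() output gives a way to put a number on "no longer replicating": compare each secondary's `optimeDate` with the primary's. This is a minimal sketch with a hypothetical helper name (`replicationLagSeconds`); the sample dates are the ones from the primary's output above, written as plain JavaScript `Date` objects so the snippet runs outside the shell.

```javascript
// Hypothetical helper: how far each SECONDARY (state 2) is behind the
// PRIMARY (state 1), using the optimeDate fields of rs.status().
function replicationLagSeconds(status) {
  const primary = status.members.filter(function (m) { return m.state === 1; })[0];
  if (!primary) {
    return []; // no primary visible from this node, nothing to compare against
  }
  return status.members
    .filter(function (m) { return m.state === 2; })
    .map(function (m) {
      return {
        name: m.name,
        lagSeconds: (primary.optimeDate - m.optimeDate) / 1000
      };
    });
}

// Trimmed copy of the primary's rs.status() output from the question.
const primaryStatus = {
  members: [
    { name: "vimax.aireclaim.com:27017", state: 2,
      optimeDate: new Date("2020-03-22T00:47:43Z") },
    { name: "mx.aireclaim.com:27017", state: 1,
      optimeDate: new Date("2020-03-23T20:08:20Z") }
  ]
};

const lag = replicationLagSeconds(primaryStatus);
console.log(lag[0].name, lag[0].lagSeconds); // vimax.aireclaim.com:27017 156037
```

156037 seconds is a little over 43 hours, so at the time of the snapshot the secondary had not applied anything since the early hours of 2020-03-22.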

Any insight would be a godsend.

Updating the MongoDB version to 4.0.5 solved this problem.
