Service Fabric cluster status "Upgrade service unreachable"

Originally posted 2017-05-30 08:39:18 | Tags: azure, azure-service-fabric

I had an SF cluster made of 3 Standard A0 nodes. I scaled the cluster in to 1 node and realized this was a bad idea, because nothing worked in that state (even SF Explorer was not working). Then I scaled it back out to 3 nodes and restarted the primary scale set. Now all nodes in the scale set are up and running, but the SF cluster status is "Upgrade service unreachable". I saw a similar question, "Service Fabric Status: Upgrade service unreachable", where it was recommended to scale the nodes up to D2, but this hasn't solved my problem. I connected to one node via RDP, and here are some event logs:

EventLog -> Applications and Service Logs -> Microsoft Service Fabric -> Operational:

Node name: _SSService_0 has failed to open with upgrade domain: 0, fault domain: fd:/0, address: 10.0.0.4, hostname: SSService000000, isSeedNode: true, versionInstance: 5.6.210.9494:3, id: d9e8bae2d4d8116bfefb989b95e91f7b, dca instance: 131405546580494698, error: FABRIC_E_TIMEOUT

EventLog -> Applications and Service Logs -> Microsoft Service Fabric -> Admin:

client-10.0.0.4:19000/10.0.0.4:19000: error = 2147943625, failureCount=487. Filter by (type~Transport.St && ~"(?i)10.0.0.4:19000") to get listener lifecycle. Connect failure is expected if listener was never started, or listener/its process was stopped before/during connecting.
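Editor's note: the decimal error value in the Admin log above is easier to read once it is split into its HRESULT fields. The sketch below is not part of the original post; it assumes the value is a standard Windows HRESULT, and the helper name `decode_hresult` and the one-entry name table are made up for illustration. For 2147943625 it gives 0x800704C9, facility 7 (Win32), code 1225, which is the Win32 error ERROR_CONNECTION_REFUSED. That is consistent with the log's own hint that nothing was listening on 10.0.0.4:19000.

```python
# Minimal sketch: split a Windows HRESULT into severity, facility and code.
# Assumption: the "error = 2147943625" value in the log is a standard HRESULT.

FACILITY_WIN32 = 7

# Only the one Win32 code relevant to this log; not a complete table.
WIN32_NAMES = {1225: "ERROR_CONNECTION_REFUSED"}


def decode_hresult(value):
    """Return the HRESULT fields of `value` as a dict."""
    value &= 0xFFFFFFFF                # treat the input as an unsigned 32-bit value
    code = value & 0xFFFF              # low 16 bits: error code
    facility = (value >> 16) & 0x7FF   # next 11 bits: facility
    name = WIN32_NAMES.get(code) if facility == FACILITY_WIN32 else None
    return {
        "hex": f"0x{value:08X}",
        "failure": bool(value & 0x80000000),  # top bit set means failure
        "facility": facility,
        "code": code,
        "name": name,
    }


print(decode_hresult(2147943625))
```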

2 answers

If you scale the cluster down by resizing the VM scale set to 1, you are basically destroying the cluster, because by design it requires a minimum of 3 nodes. The only way forward is therefore to recreate it from scratch.

If you need a tiny cluster consisting of just 1 node (for testing purposes, say), there is now a way in Azure to create a single-node cluster, but you won't be able to scale it, as it is a special case that is not meant for production use.

"Upgrade service unreachable" happens if the number of active VMs or nodes in the cluster drops to 0 for any reason. In my case, this happened after restarting all the VMs at the same time. In this state, the nodes are available and running, but they have been disconnected from the cluster.

I resolved this by deallocating and restarting the node from the virtual machine scale set.
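Editor's note: one rough way to tell "the VM is running" apart from "the Service Fabric listener on that VM is accepting connections" is a plain TCP probe of the client endpoint. The sketch below is an illustration, not something from the answers; the function name `port_accepts_connections` is made up, and the address 10.0.0.4 and port 19000 are simply the values that appear in the question's log. A refused or timed-out connection only shows that nothing is listening there; it says nothing about why.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    This checks the listener only. It does not speak the Service Fabric
    protocol and does not prove that the cluster is healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Run from a machine inside the virtual network, `port_accepts_connections("10.0.0.4", 19000)` returning `False` would match the connect failures in the Admin log.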