
Automatic Retry on grain fail-over in Microsoft Orleans

Original post: 2016-08-23 09:21:52. Tags: c#, .net, actor, orleans

We're testing how grain fail-over works when a silo becomes unresponsive for some reason (server is down, etc.). Currently we have two silos running on two different machines, with grains activated on each of them. We then kill one of the silos unexpectedly. When we try to call a grain on the DEAD silo, after some timeout (I believe it is 3 minutes or so in total) an exception is thrown, stating that the silo is rejecting the connection. We believe that after a silo is declared DEAD, a grain is only activated on another silo if we retry the activation. This is working fine for us. However, we would like to know whether there is some way to do the retry automatically instead of writing the logic ourselves.

1 Answer

First, 3 minutes sounds like way too much. It should be tens of seconds if you are using the default liveness settings. What system store are you using?

If you want to retry automatically, you'd better wrap all your client grain calls in a wrapper that retries, for example with exponential back-off. Doing the retries yourself gives you much more control over what to retry and how.
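Such a wrapper could look roughly like the sketch below. This is only an illustration, not something that ships with Orleans: the `GrainCallRetry` class, the `WithRetryAsync` method, the default of 5 attempts and the 200 ms starting delay are all made-up names and values, and retrying on every exception type is an assumption you would probably want to narrow through the `shouldRetry` predicate.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of the Orleans API.
public static class GrainCallRetry
{
    // Invokes `call`, retrying with exponential back-off when it throws.
    // maxAttempts and initialDelay are illustrative defaults, not recommended values.
    public static async Task<T> WithRetryAsync<T>(
        Func<Task<T>> call,
        int maxAttempts = 5,
        TimeSpan? initialDelay = null,
        Func<Exception, bool> shouldRetry = null)
    {
        if (call == null) throw new ArgumentNullException(nameof(call));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        TimeSpan delay = initialDelay ?? TimeSpan.FromMilliseconds(200);

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await call();
            }
            catch (Exception ex) when (attempt < maxAttempts
                                       && (shouldRetry == null || shouldRetry(ex)))
            {
                // Swallow the failure and fall through to the delay below.
                // On the last attempt the filter is false, so the exception propagates.
            }

            await Task.Delay(delay);
            delay = TimeSpan.FromTicks(delay.Ticks * 2); // double the wait each time
        }
    }
}

// Usage on the client, where `grain` is a grain reference you already hold
// and GetValueAsync() is a hypothetical grain method:
//
//   var value = await GrainCallRetry.WithRetryAsync(() => grain.GetValueAsync());
```

Because the grain call is passed as a delegate, each retry issues a fresh call, which gives the cluster a chance to reactivate the grain on a surviving silo once the dead one has been removed from the membership. Passing a `shouldRetry` predicate lets you retry only the failures you consider transient and let everything else surface immediately.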