
MSR Orleans - How to create a reader-writer grain with parallel reads

Original post: 2018-01-08 06:48:52 | Tags: c#, multithreading, parallel-processing, orleans

I need a reader-writer grain (or grains) to hold some values so that other parts of the system can read them frequently and in parallel.

What I'm after is a way to store some system-wide config values that are read frequently and change only very rarely (once a month or so at most). The system should be reconfigurable without downtime. What I'm currently considering is storing the data in a database. It will be read at silo startup, and a special callback will read it again after it changes externally. I don't want to read the data from the database every time I need it because:

It would create unnecessary processing overhead inside the silo, as some of the data must be processed and filtered out.
It would increase load on the database, which I cannot guarantee will handle high load as well as the silo environment does.
The data is verified by the silo environment before it is updated. Reading directly from the database means there would be no middle layer to hold the last known valid data while the operator updates and fixes the new data.

I can easily create an in-memory data store guarded by a reader-writer lock, but Orleans' single-threaded execution policy doesn't allow parallel access to the grain that holds the data. I can think of the following ways to bypass this:

Have multiple copies of the data inside multiple grains. This is obviously not optimal.
Use static fields to store the data and make the grain a stateless worker. This means every silo has its own copy of the data (which also helps reduce network load), but there is no means of asking every silo to update its copy of the data (at least none that I know of).

Suggestions?

2 Answers

Have you looked at something like the Smart cache pattern?

Perhaps using a Reentrant grain could help as well, since it allows method calls to interleave.

I found this issue on GitHub asking for the same thing.
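
As a rough illustration of the Reentrant suggestion, a grain might look like the sketch below. The interface and class names (`IConfigGrain`, `ConfigGrain`) and the method shapes are assumptions made for this example, not taken from the question or the linked issue. It uses the `[Reentrant]` attribute from `Orleans.Concurrency` and needs the Orleans packages to compile. Note that a reentrant activation still runs one turn at a time; calls only interleave at await points, so this alone does not give truly parallel reads.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Orleans;
using Orleans.Concurrency;

// Hypothetical grain interface; names are illustrative only.
public interface IConfigGrain : IGrainWithIntegerKey
{
    Task<string> GetValue(string key);
    Task UpdateConfig(Dictionary<string, string> newValues);
}

[Reentrant] // lets calls to this activation interleave at await points
public class ConfigGrain : Grain, IConfigGrain
{
    private Dictionary<string, string> _values = new Dictionary<string, string>();

    // Synchronous read: completes within a single turn, returns null if the key is missing.
    public Task<string> GetValue(string key)
    {
        _values.TryGetValue(key, out var value);
        return Task.FromResult(value);
    }

    // Swap in a private copy so callers cannot mutate the grain's state afterwards.
    public Task UpdateConfig(Dictionary<string, string> newValues)
    {
        _values = new Dictionary<string, string>(newValues);
        return Task.CompletedTask;
    }
}
```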

We found a solution that doesn't require timer-based updates over on GitHub. I'll detail the solution here:

There is a master grain, responsible for reading data from the database. This grain also receives external UpdateConfig calls when the config changes. The database is updated 'manually', so I don't care that the grain doesn't pick up changes automatically.
There are stateless worker cache grains on every silo. These grains store the data in static objects (to reduce the memory footprint) and use a reader-writer lock to manage access to the data. When the first one wakes up on a silo, it loads the data and the rest just use that data. They also support an external UpdateConfig call, which in turn asks the master grain for new data.
There is a bootstrap provider which does two things. First, it wakes the master grain up at init time, so data is available before it is needed and we avoid lazy-init hiccups. Second, the provider also supports a control command (see here for an example of controllable providers). Upon receiving this command, it asks the first available cache grain (guaranteed to be local) to update the static objects. This way, data is updated within each silo. The master grain simply sends this control command whenever it updates its data.
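
The per-silo static store described above could be sketched roughly as follows. This is only a minimal illustration using `ReaderWriterLockSlim` from the .NET base class library; the class name `SiloConfigCache`, its members, and the `Demo` entry point are hypothetical and not taken from the actual solution. In the design above, a stateless worker cache grain would call `Replace` with data fetched from the master grain and serve reads through `TryGet`.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical per-process (per-silo) cache shared by all cache grain activations.
public static class SiloConfigCache
{
    private static readonly ReaderWriterLockSlim RwLock = new ReaderWriterLockSlim();
    private static Dictionary<string, string> _values = new Dictionary<string, string>();
    private static bool _loaded;

    // True once some activation on this silo has loaded the data.
    public static bool IsLoaded
    {
        get
        {
            RwLock.EnterReadLock();
            try { return _loaded; }
            finally { RwLock.ExitReadLock(); }
        }
    }

    // Many readers may hold the read lock at the same time.
    public static bool TryGet(string key, out string value)
    {
        RwLock.EnterReadLock();
        try { return _values.TryGetValue(key, out value); }
        finally { RwLock.ExitReadLock(); }
    }

    // Replaces the whole data set under the exclusive write lock.
    public static void Replace(IDictionary<string, string> validated)
    {
        var copy = new Dictionary<string, string>(validated); // copy outside the lock
        RwLock.EnterWriteLock();
        try { _values = copy; _loaded = true; }
        finally { RwLock.ExitWriteLock(); }
    }
}

public static class Demo
{
    public static void Main()
    {
        SiloConfigCache.Replace(new Dictionary<string, string> { ["timeout"] = "30" });
        SiloConfigCache.TryGet("timeout", out var timeout);
        Console.WriteLine(timeout); // prints 30
    }
}
```

Copying the dictionary before taking the write lock keeps the exclusive section to a single reference swap, so readers are blocked only briefly during the rare updates.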