
Which collection should I use to read elements from multiple threads and periodically overwrite the whole collection?

I'm going to use a static collection that the core process reads from and that a background service fully replaces every X minutes.

The background process will load updated data from the database every X minutes and assign the received dataset to this static collection.

The core process will receive many tasks that check whether certain values exist in this collection. Each task is processed in a separate thread. There will be a lot of requests and lookups must be extremely fast, so I can't query the database for each request; I need an updatable list in memory.

public class LoadedData
{
    public static HashSet<string> Keys { get; set; }
}

public class CoreProcess
{
    public bool ElementExists(string key)
    {
        return LoadedData.Keys.Contains(key);
    }
}

public class BackgroundProcess
{
    public async Task LoadData()
    {
        while (true)
        {
            LoadedData.Keys = GetKeysFromDb();
            await Task.Delay(TimeSpan.FromMinutes(5));
        }
    }
}

So, I'm looking for the best solution for this. I was thinking about using HashSet<T> because I'm sure that each element in the collection will be unique. But HashSet<T> is not thread-safe. So I started considering BlockingCollection<T>, ConcurrentBag<T>, and ConcurrentDictionary<T, byte>, but then I wondered whether I need a thread-safe collection here at all. It looks like I don't, because I'm not going to add, update, or remove individual elements in the collection. There is only a full rewrite from the database.

  1. So, does this mean that I can just use a plain HashSet<T>?

  2. Which collection would you use to solve this?

  3. And in general, are there any issues with the core process reading the collection while the background process fully overwrites it?

So the HashSet<string> becomes effectively immutable as soon as it becomes the value of the LoadedData.Keys property. In this case your code is almost OK. The only missing ingredient is ensuring that every thread involved sees the current value of this property.
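To make "effectively immutable" concrete, here is a minimal sketch of the build-then-publish pattern that the question's code already follows. The names KeyStore and Replace are hypothetical and used only for illustration. The visibility concern raised in this answer is deliberately left out of the sketch.

```csharp
using System.Collections.Generic;

// Hypothetical holder, standing in for LoadedData from the question.
public static class KeyStore
{
    // Starts as an empty set (an assumption) so readers never see null.
    public static HashSet<string> Keys { get; private set; } = new HashSet<string>();

    // Build the new set privately, then publish it with one reference assignment.
    // Readers observe either the old set or the new one, never a half-filled set.
    public static void Replace(IEnumerable<string> freshKeys)
    {
        var next = new HashSet<string>(freshKeys); // populated before anyone can see it
        Keys = next;                               // reference assignment is atomic in .NET
        // After this point nothing may call Add, Remove, or Clear on 'next'.
    }
}
```

The pattern to avoid is refreshing in place, for example calling Keys.Clear() followed by Keys.UnionWith(...), because that mutates a set that other threads may be reading at the same moment.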

In theory it is possible that the compiler or the JIT compiler (the "jitter") might use a cached, stale value of the property instead of looking at what is currently stored in main memory. In practice you might never experience this phenomenon, but if you want to play by the rules you must read and write this property with volatile semantics. If Keys were a field, you could just decorate it with the volatile keyword. Since it's a property, you must do a bit more work:

public class LoadedData
{
    private volatile static HashSet<string> _keys;

    public static HashSet<string> Keys
    {
        get => _keys;
        set => _keys = value;
    }
}

...or using the Volatile class instead of the volatile keyword:

public class LoadedData
{
    private static HashSet<string> _keys;

    public static HashSet<string> Keys
    {
        get => Volatile.Read(ref _keys);
        set => Volatile.Write(ref _keys, value);
    }
}
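One practical consequence of swapping the whole reference is that a reader performing several lookups should read the property once and work against that local snapshot, so that every lookup hits the same set. The sketch below illustrates this under some assumptions: the class names LoadedDataSnapshot and CoreProcessSketch and the helper AllExist are hypothetical, and the field is initialized to an empty set so that readers never see null before the first load.

```csharp
using System.Collections.Generic;
using System.Threading;

// Same shape as the Volatile-based LoadedData, renamed for this sketch.
public static class LoadedDataSnapshot
{
    // Assumption: start with an empty set rather than null.
    private static HashSet<string> _keys = new HashSet<string>();

    public static HashSet<string> Keys
    {
        get => Volatile.Read(ref _keys);
        set => Volatile.Write(ref _keys, value);
    }
}

public static class CoreProcessSketch
{
    // Hypothetical helper: checks several keys against one consistent snapshot.
    public static bool AllExist(params string[] keys)
    {
        // Single volatile read; a concurrent replacement cannot affect this local.
        HashSet<string> snapshot = LoadedDataSnapshot.Keys;
        foreach (string key in keys)
        {
            if (!snapshot.Contains(key))
            {
                return false;
            }
        }
        return true;
    }
}
```

A single Contains call, as in the question's ElementExists, needs no snapshot, because it reads the property only once.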

A final cautionary note: the immutability of the HashSet<string> is not enforced by the compiler. It's just a verbal contract that you make with your future self and with any other future maintainers of your code. If some mutating code finds its way into your code base, the behavior of your program will become officially undefined. If you want to guard yourself against this scenario, the most semantically correct way to do it is to replace the HashSet<string> with an ImmutableHashSet<string>. The immutable collections are significantly slower than their mutable counterparts (typically at least 10x slower), so it's a trade-off. You can have peace of mind or ultimate performance, but not both.
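For reference, a minimal sketch of the ImmutableHashSet<string> variant might look like the following. The class name ImmutableLoadedData and the Replace method are hypothetical, the initial empty set is an assumption, and the System.Collections.Immutable package must be available (it ships with modern .NET).

```csharp
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Threading;

public static class ImmutableLoadedData
{
    // Assumption: start with the shared empty instance rather than null.
    private static ImmutableHashSet<string> _keys = ImmutableHashSet<string>.Empty;

    public static ImmutableHashSet<string> Keys
    {
        get => Volatile.Read(ref _keys);
        set => Volatile.Write(ref _keys, value);
    }

    // Hypothetical helper: builds an immutable set from the loaded keys and publishes it.
    public static void Replace(IEnumerable<string> freshKeys)
    {
        Keys = ImmutableHashSet.CreateRange(freshKeys);
    }
}
```

With this type, a call such as Keys.Add("x") returns a new set and leaves the published one untouched, so accidental in-place mutation is no longer possible.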
