Efficient small byte-arrays in C#

I have a huge collection of very small objects. To ensure the data is stored very compactly, I rewrote the class to store all information in a byte array with variable-byte encoding. Most of these millions of objects need only 3 to 7 bytes to store all their data.

After memory profiling, I found that each of these byte arrays always takes at least 32 bytes.

Is there a way to store the information more compactly than bit-fiddling it into a byte[]? Would it be better to point to an unmanaged array?

class MyClass
{
    byte[] compressed;

    public MyClass(IEnumerable<int> data)
    {
        compressed = compress(data);
    }

    private byte[] compress(IEnumerable<int> data)
    {
        // ...
    }

    private IEnumerable<int> decompress(byte[] compressedData)
    {
        // ...
    }

    public IEnumerable<int> Data { get { return decompress(compressed); } }
}

You're facing a couple of problems that eat up memory. One is object overhead, and the other is objects aligning to 32- or 64-bit boundaries (depending on your build). Your current approach suffers from both issues. The following sources describe this in more detail:

I played around with this when I was fiddling with benchmarking sizes.
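To see the overhead for yourself, something like the following rough probe can help. It is only a sketch: the class and method names are made up for illustration, and it assumes a runtime that provides GC.GetAllocatedBytesForCurrentThread. On a 64-bit build I would expect it to report around 32 bytes for a 5-byte payload (header, type pointer and length, plus the payload rounded up to 8 bytes), but the exact figure depends on the runtime and the build.

```csharp
using System;

// Hypothetical helper, not part of the question's code.
public static class ArraySizeProbe
{
    // Returns the average number of heap bytes allocated per byte[payloadLength].
    public static long BytesPerArray(int payloadLength, int count = 100000)
    {
        // Allocate the holder first so it is not included in the measurement.
        var keep = new byte[count][];

        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < count; i++)
        {
            keep[i] = new byte[payloadLength];
        }
        long after = GC.GetAllocatedBytesForCurrentThread();

        // Keep the arrays reachable until the measurement is done.
        GC.KeepAlive(keep);
        return (after - before) / count;
    }
}
```

Calling ArraySizeProbe.BytesPerArray(5) and ArraySizeProbe.BytesPerArray(0) side by side shows how much of each array is fixed overhead rather than payload.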

A simple solution would be to create a struct whose single member is a long value. Its methods would handle packing bytes into and unpacking them out of that long, using shift-and-mask bit fiddling.
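A minimal sketch of that struct might look like this. The name PackedBytes, the 7-byte limit and the choice to keep the payload length in the top byte are my assumptions for illustration; the idea itself only calls for a single long plus shifting and masking.

```csharp
using System;

// Hypothetical name and layout, chosen for illustration only.
public struct PackedBytes
{
    // Bits 0..55 hold up to 7 payload bytes; bits 56..63 hold the payload length.
    private readonly long _bits;

    public const int MaxLength = 7;

    public PackedBytes(byte[] data)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        if (data.Length > MaxLength)
            throw new ArgumentException("At most 7 bytes fit next to the length byte.", nameof(data));

        long bits = (long)data.Length << 56;
        for (int i = 0; i < data.Length; i++)
        {
            bits |= (long)data[i] << (8 * i);   // byte i goes into bits 8*i .. 8*i+7
        }
        _bits = bits;
    }

    public int Length => (int)((_bits >> 56) & 0xFF);

    public byte this[int index]
    {
        get
        {
            if (index < 0 || index >= Length) throw new IndexOutOfRangeException();
            return (byte)((_bits >> (8 * index)) & 0xFF);
        }
    }

    // Unpacks the payload into a fresh array, e.g. to hand it to decompress().
    public byte[] ToArray()
    {
        var result = new byte[Length];
        for (int i = 0; i < result.Length; i++)
        {
            result[i] = this[i];
        }
        return result;
    }
}
```

Stored in a PackedBytes[] (or as a field inside another object), each value costs 8 bytes with no per-item object header. Payloads longer than 7 bytes do not fit in this layout and would need some fallback, such as a regular byte[] for the rare large instances.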

Another idea would be a class that serves up objects by ID and stores the actual bytes in a single backing List<byte>. But this would get complicated and messy. I think the struct idea is much more straightforward.
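For comparison, the List<byte> idea could be sketched roughly as below. Everything here is hypothetical: ByteStore, Add and Get are names I made up, and the sketch is append-only, which hides exactly the messy parts (removing or resizing an entry would require compacting the list and rewriting offsets).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; all names are illustrative.
public sealed class ByteStore
{
    private readonly List<byte> _bytes = new List<byte>();   // all payloads, back to back
    private readonly List<int> _offsets = new List<int>();   // start offset of each payload

    // Appends a payload and returns its ID (its position in insertion order).
    public int Add(byte[] data)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        _offsets.Add(_bytes.Count);
        _bytes.AddRange(data);
        return _offsets.Count - 1;
    }

    // Copies the payload stored under the given ID into a new array.
    public byte[] Get(int id)
    {
        if (id < 0 || id >= _offsets.Count) throw new ArgumentOutOfRangeException(nameof(id));

        int start = _offsets[id];
        // A payload ends where the next one starts, or at the end of the list.
        int end = id + 1 < _offsets.Count ? _offsets[id + 1] : _bytes.Count;

        var result = new byte[end - start];
        _bytes.CopyTo(start, result, 0, result.Length);
        return result;
    }
}
```

Each entry then costs its payload plus a 4-byte offset, at the price of an extra lookup and a copy on every read.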
