
C# serialization without struct metadata itself

Right now I'm working on a game engine. To be more efficient and to keep data hidden from the end user, I'm trying to use serialization on a modified form of the Wavefront *.OBJ format. I have multiple structs set up to represent the data, and serializing the objects works fine, except that the result takes up a significant amount of file space (at least five times the size of the original OBJ file).

To be specific, here's a quick example of what the final object would be (in a JSON-esque format):

{
    [{float 5.0, float 2.0, float 1.0}, {float 7.0, float 2.0, float 1.0}, ...]
    // ^^^ vertex positions
    // other similar structures for colors, normals, texture coordinates
    // ...

    [[{int 1, int 1, int 1}, {int 2, int 2, int 1}, {int 3, int 3, int 2}], ...]
    //represents one face; represents the following
    //face[vertex{position index, text coords index, normal index}, vertex{}...]
}

Basically, my main issue with this method of serializing data (binary format) is that it saves the names of the structs, not just the values. I'd love to keep the data in the format I already have, just without saving the struct itself in my data. I want to save something similar to the above, while still being able to recompile with a different struct name later.

Here's the main object I'm serializing and saving to a file:

[Serializable()] //the included structs have this applied
public struct InstantGameworksObjectData
{
    public Position[] Positions;
    public TextureCoordinates[] TextureCoordinates;
    public Position[] Normals;

    public Face[] Faces;
}

Here's the method in which I serialize and save the data:

IFormatter formatter = new BinaryFormatter();

long Beginning = DateTime.Now.Ticks / 10000000;
foreach (string file in fileNames)
{
    Console.WriteLine("Begin " + Path.GetFileName(file));

    var output = InstantGameworksObject.ConvertOBJToIGWO(File.ReadAllLines(file));

    Console.WriteLine("Writing file");

    Stream fileOutputStream = new FileStream(outputPath + @"\" + Path.GetFileNameWithoutExtension(file) + ".igwo", FileMode.Create, FileAccess.Write, FileShare.None);
    formatter.Serialize(fileOutputStream, output);


    Console.WriteLine(outputPath + @"\" + Path.GetFileNameWithoutExtension(file) + ".igwo");
}

The output, of course, is in binary/hex (depending on which program you use to view the file), and that's great:

[Image: output text file]

But putting it into an online hex-to-text converter reveals specific name data:

[Image: converted from raw]

In the long run, this could mean gigabytes' worth of useless data. How can I save my C# object with the data in the correct format, just without the extra meta-clutter?

As you correctly note, the standard framework binary formatters include a host of metadata about the structure of the data. This is to keep the serialised data self-describing. If they separated the data from all that metadata, then the smallest change to the structure of the classes would render previously serialised data useless. By that token, I doubt you'd find any standard framework method of serialising binary data that doesn't include all the metadata.

Even ProtoBuf stores some of the semantics of the data (field tags and wire types) alongside the values, albeit with far less overhead.

Given that the structure of your data follows the reasonably common and well-established form of 3D object data, you could roll your own format for your assets, one which strips the semantics and stores only the raw data. You can implement read and write methods easily using the BinaryReader/BinaryWriter classes (which would be my preference). If you're looking to obfuscate data from the end user, there are a variety of ways you could achieve that with this approach.

For example:

public static InstantGameworksObjectData ReadIgoObject(BinaryReader pReader)
{
    var lOutput = new InstantGameworksObjectData();

    int lVersion = pReader.ReadInt32();     // Useful in case you ever want to change the format

    int lPositionCount = pReader.ReadInt32();   // Store the length of the Position array before the data so you can pre-allocate the array.
    lOutput.Positions = new Position[lPositionCount];
    for ( int lPositionIndex = 0 ; lPositionIndex < lPositionCount ; ++ lPositionIndex )
    {
        lOutput.Positions[lPositionIndex] = new Position();
        lOutput.Positions[lPositionIndex].X = pReader.ReadSingle();
        lOutput.Positions[lPositionIndex].Y = pReader.ReadSingle();
        lOutput.Positions[lPositionIndex].Z = pReader.ReadSingle();
        // or if you prefer...  lOutput.Positions[lPositionIndex] = Position.ReadPosition(pReader);
    }

    int lTextureCoordinateCount = pReader.ReadInt32();
    // Size the array by its own count (not lPositionCount); the element type matches the struct's field declaration.
    lOutput.TextureCoordinates = new TextureCoordinates[lTextureCoordinateCount];
    for ( int lTextureCoordinateIndex = 0 ; lTextureCoordinateIndex < lTextureCoordinateCount ; ++ lTextureCoordinateIndex )
    {
        lOutput.TextureCoordinates[lTextureCoordinateIndex] = new TextureCoordinates();
        lOutput.TextureCoordinates[lTextureCoordinateIndex].X = pReader.ReadSingle();
        lOutput.TextureCoordinates[lTextureCoordinateIndex].Y = pReader.ReadSingle();
        lOutput.TextureCoordinates[lTextureCoordinateIndex].Z = pReader.ReadSingle();
        // or if you prefer...  lOutput.TextureCoordinates[lTextureCoordinateIndex] = TextureCoordinates.ReadTextureCoordinates(pReader);
    }

    // ... (read Normals and Faces in the same way)

    return lOutput;
}
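
The write side mirrors the reader. The following is only a minimal sketch for the Positions block, assuming a `Position` struct with public `X`, `Y` and `Z` float fields as the reader above implies; the class name `IgwoWriter`, the method name and the version constant are made up for illustration.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for the question's struct; the real fields may differ.
public struct Position { public float X, Y, Z; }

public static class IgwoWriter
{
    // Assumed version number, written first so the reader can branch on it later.
    public const int FormatVersion = 1;

    // Writes: version (int), position count (int), then X, Y, Z as raw floats.
    // No type or field names end up in the stream.
    public static void WriteHeaderAndPositions(BinaryWriter pWriter, Position[] pPositions)
    {
        pWriter.Write(FormatVersion);
        pWriter.Write(pPositions.Length);
        foreach (Position lPosition in pPositions)
        {
            pWriter.Write(lPosition.X);
            pWriter.Write(lPosition.Y);
            pWriter.Write(lPosition.Z);
        }
    }
}
```

With this layout, two positions take 4 + 4 + 2 × 12 = 32 bytes. The remaining arrays would follow the same count-then-values pattern, in exactly the order the reader consumes them.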

As far as space efficiency and speed go, this approach is hard to beat. It works well for 3D objects because they're fairly well-defined and the format is unlikely to change, but it may not extend well to the other assets that you want to store.
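
If file size is still a concern, one option (my assumption, not something the format requires) is to wrap the raw bytes in a `GZipStream` from `System.IO.Compression`. This is a sketch only; the helper name `IgwoCompression` is hypothetical. Note that compression merely makes casual inspection harder; it is not encryption.

```csharp
using System.IO;
using System.IO.Compression;

public static class IgwoCompression
{
    // Compresses an already-serialised byte array with GZip.
    public static byte[] Compress(byte[] pRaw)
    {
        using (var lOutput = new MemoryStream())
        {
            // The GZipStream must be disposed before reading the result,
            // otherwise the final compressed block is not flushed.
            using (var lGzip = new GZipStream(lOutput, CompressionMode.Compress))
            {
                lGzip.Write(pRaw, 0, pRaw.Length);
            }
            return lOutput.ToArray();
        }
    }

    // Reverses Compress and returns the original bytes.
    public static byte[] Decompress(byte[] pPacked)
    {
        using (var lInput = new MemoryStream(pPacked))
        using (var lGzip = new GZipStream(lInput, CompressionMode.Decompress))
        using (var lOutput = new MemoryStream())
        {
            lGzip.CopyTo(lOutput);
            return lOutput.ToArray();
        }
    }
}
```

The same idea works directly on files by passing the `GZipStream` to the `BinaryWriter` or `BinaryReader` instead of going through byte arrays.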

If you find you need to change class structures frequently, you may have to write lots of version-based if-blocks to read a file correctly, and regularly debug issues where the data in the file is not quite in the format you expect. A happy medium might be to use something such as ProtoBuf for the bulk of your development until you're happy with the structure of your data object classes, and then write raw binary Read/Write methods for each of them before you release.
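
To give a feel for what those version-based if-blocks look like, here is a small sketch. The format change is invented purely for illustration: it assumes version 1 stored only X and Y for each position and version 2 added Z. The `Position` struct and the `VersionedReader` name are likewise hypothetical.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for the question's struct.
public struct Position { public float X, Y, Z; }

public static class VersionedReader
{
    // Reads: version (int), count (int), then per-position floats.
    // Invented history: v1 files have X and Y only; v2 files also have Z.
    public static Position[] ReadPositions(BinaryReader pReader)
    {
        int lVersion = pReader.ReadInt32();
        if (lVersion < 1 || lVersion > 2)
            throw new InvalidDataException("Unsupported format version " + lVersion);

        int lCount = pReader.ReadInt32();
        var lPositions = new Position[lCount];
        for (int lIndex = 0; lIndex < lCount; ++lIndex)
        {
            lPositions[lIndex].X = pReader.ReadSingle();
            lPositions[lIndex].Y = pReader.ReadSingle();
            if (lVersion >= 2)
            {
                // Field added in v2; v1 files leave Z at its default of 0.
                lPositions[lIndex].Z = pReader.ReadSingle();
            }
        }
        return lPositions;
    }
}
```

Every further format change adds another branch like this, which is why it pays to settle the structure before committing to a raw format.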

I'd also recommend some unit tests to ensure that your Read and Write methods persist the object correctly, to avoid pulling your hair out later.
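
A round-trip check is usually enough to catch a reader and writer drifting apart. The sketch below is framework-agnostic (it returns a bool you can assert on in whichever test framework you use) and relies on a hypothetical `PositionIo` helper and `Position` struct, neither of which comes from your code.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical stand-in for the question's struct.
public struct Position { public float X, Y, Z; }

// Hypothetical minimal reader/writer pair under test.
public static class PositionIo
{
    public static void Write(BinaryWriter pWriter, Position[] pItems)
    {
        pWriter.Write(pItems.Length);
        foreach (Position lItem in pItems)
        {
            pWriter.Write(lItem.X);
            pWriter.Write(lItem.Y);
            pWriter.Write(lItem.Z);
        }
    }

    public static Position[] Read(BinaryReader pReader)
    {
        var lItems = new Position[pReader.ReadInt32()];
        for (int lIndex = 0; lIndex < lItems.Length; ++lIndex)
        {
            lItems[lIndex].X = pReader.ReadSingle();
            lItems[lIndex].Y = pReader.ReadSingle();
            lItems[lIndex].Z = pReader.ReadSingle();
        }
        return lItems;
    }
}

public static class RoundTripCheck
{
    // Writes to memory, reads back, and compares field by field.
    public static bool PositionsSurviveRoundTrip(Position[] pOriginal)
    {
        using (var lStream = new MemoryStream())
        {
            // leaveOpen keeps the MemoryStream usable after the writer is disposed.
            using (var lWriter = new BinaryWriter(lStream, Encoding.UTF8, true))
            {
                PositionIo.Write(lWriter, pOriginal);
            }

            lStream.Position = 0;
            using (var lReader = new BinaryReader(lStream))
            {
                Position[] lCopy = PositionIo.Read(lReader);
                if (lCopy.Length != pOriginal.Length) return false;
                for (int lIndex = 0; lIndex < lCopy.Length; ++lIndex)
                {
                    if (lCopy[lIndex].X != pOriginal[lIndex].X ||
                        lCopy[lIndex].Y != pOriginal[lIndex].Y ||
                        lCopy[lIndex].Z != pOriginal[lIndex].Z)
                        return false;
                }
                return true;
            }
        }
    }
}
```

Exact float comparison is fine here because the values are copied bit for bit, not recomputed.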

Hope this helps.
