What is the best way to read and then update a record in a binary file with C#?

I'm trying to edit some records in a binary file, but I just can't seem to get the hang of it.

I can read the file, but then I can't find the position of the record I want to edit so that I can replace it.

This is my code so far:

public MyModel Put(MyModel exMyModel)
{
        List<MyModel> list = new List<MyModel>();

        try
        {
            IFormatter formatter = new BinaryFormatter();

            using (Stream stream = new FileStream(_exMyModel, FileMode.Open, FileAccess.Read, FileShare.Read))
            {
                while (stream.Position < stream.Length)
                {
                    var obj = (MyModel)formatter.Deserialize(stream);
                    list.Add(obj);
                }
            
                MyModel mymodel = list.FirstOrDefault(i => i.ID == exMyModel.ID);
                mymodel.FirstName = exMyModel.FirstName;
                mymodel.PhoneNumber = exMyModel.PhoneNumber;
                
                // Now I want to update the current record with this new object
                // ... code to update
            }

            return exMyModel;
        }
        catch (Exception ex)
        {
            Console.WriteLine("The error is " + ex.Message);
            return null;
        }
}

I'm really stuck here, guys. Any help would be appreciated.

I already checked these answers:

Thank you in advance :)

I would recommend just writing all objects back to the stream. You could perhaps write only the changed object and every object after it, but I would not bother.

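Start by resetting the stream: stream.Position = 0. You can then write a loop and serialize each object using formatter.Serialize(stream, obj). Note that the stream has to be opened with write access for this (the code in the question opens it with FileAccess.Read), and if the rewritten data can end up shorter than the old data you should truncate the leftover bytes with stream.SetLength(stream.Position) afterwards.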

If this is a coding task, I guess you have no choice in the matter. But you should know that BinaryFormatter has various problems. It more or less saves the objects the same way they are stored in memory. This is inefficient and insecure, and changes to the classes may prevent you from deserializing stored objects. The most common serialization format today is JSON, but there are also binary alternatives like protobuf-net.
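As a rough illustration of the JSON route, here is a minimal sketch using System.Text.Json (included in .NET Core 3.0 and later). The shape of MyModel, the JsonStore class, and the decision to store the whole list as a single JSON array are assumptions made for the example; the question does not show the real class or file layout.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.Json;

// Hypothetical record type; the real MyModel is not shown in the question.
public class MyModel
{
    public int ID { get; set; }
    public string FirstName { get; set; }
    public string PhoneNumber { get; set; }
}

public static class JsonStore
{
    // Loads every record, updates the one with a matching ID, and rewrites the file.
    // Returns the updated record, or null when no record has that ID.
    public static MyModel Put(string path, MyModel updated)
    {
        var list = JsonSerializer.Deserialize<List<MyModel>>(File.ReadAllText(path))
                   ?? new List<MyModel>();

        var existing = list.FirstOrDefault(m => m.ID == updated.ID);
        if (existing == null)
        {
            return null;
        }

        existing.FirstName = updated.FirstName;
        existing.PhoneNumber = updated.PhoneNumber;

        // The whole file is replaced, so record lengths never matter.
        File.WriteAllText(path, JsonSerializer.Serialize(list));
        return existing;
    }
}
```

Rewriting the whole file is the same idea as writing every object back to the stream, just with a format that tolerates changes to the class.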

How you update the file depends heavily on whether or not your records serialize to a fixed length.

Variable-Length Records

Since you're using strings in the record, any change in string length (as serialized bytes), or any other change that affects the length of the serialized object, makes it impossible to do an in-place update of the record.

With that in mind, you're going to have to do some extra work.

First, test the objects inside the read loop. Capture the current position before you deserialize each object, test the object for equivalence, and save the offset when you find the record you're looking for. Then either deserialize the rest of the objects in the stream, or copy the rest of the stream to a MemoryStream instance for later.

Next, truncate the file at the start of the record you're updating: call stream.SetLength with the saved offset (Stream.Length itself is read-only) and set stream.Position to the same value. Serialize the new copy of the record into the stream, then either copy the MemoryStream that holds the rest of the records back into the stream, or serialize the rest of the objects you captured.

In other words (untested but showing the general structure):

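public MyModel Put(MyModel exMyModel)
{
    try
    {
        IFormatter formatter = new BinaryFormatter();
        // Open for both reading and writing so the record can be replaced.
        using (Stream stream = File.Open(_exMyModel, FileMode.Open, FileAccess.ReadWrite))
        using (var buffer = new MemoryStream())
        {
            long location = -1;
            while (stream.Position < stream.Length)
            {
                // Remember where this record starts before reading it.
                var position = stream.Position;
                var obj = (MyModel)formatter.Deserialize(stream);
                if (obj.ID == exMyModel.ID)
                {
                    location = position;
                    // Save the records that follow, then cut the file off
                    // at the start of the record being replaced.
                    stream.CopyTo(buffer);
                    buffer.Position = 0;
                    stream.SetLength(position);
                    stream.Position = position;
                    break;
                }
            }
            // Write the new copy (appended at the end if no match was found).
            formatter.Serialize(stream, exMyModel);
            // location is 0 when the first record matched, so test with >=.
            if (location >= 0 && buffer.Length > 0)
            {
                buffer.CopyTo(stream);
            }
        }
        return exMyModel;
    }
    catch (Exception ex)
    {
        Console.WriteLine("The error is " + ex.Message);
        return null;
    }
}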

Note that in general a MemoryStream holding the serialized data will be faster and take less memory than deserializing the records and then serializing them again.

Static-Length Records

This is unlikely, but if your record type is annotated in such a way that it always serializes to the same number of bytes, then you can skip everything to do with the MemoryStream and truncating the binary file. In this case just read records until you find the right one, rewind the stream to the position where that record started (the position you captured before the read), and write a new copy of the record.

You'll have to examine the classes yourself to see what sort of serialization modifier attributes are on the string properties, and I'd suggest testing this extensively with different string values to ensure that you're actually getting the same data length for all of them. Adding or removing a single byte will screw up the remainder of the records in the file.
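To make the fixed-length idea concrete, here is a minimal sketch that does not use BinaryFormatter at all. It assumes a hand-rolled layout (a 4-byte ID, a 32-byte name field, and a 16-byte phone field, both UTF-8 and zero padded); the FixedRecordFile class and the field sizes are hypothetical and only serve the illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class FixedRecordFile
{
    // Assumed layout: 4-byte ID, 32-byte name, 16-byte phone (UTF-8, zero padded).
    public const int NameBytes = 32;
    public const int PhoneBytes = 16;
    public const int RecordSize = 4 + NameBytes + PhoneBytes;

    // Encodes text into a field of exactly `size` bytes.
    static byte[] Pad(string text, int size)
    {
        var raw = Encoding.UTF8.GetBytes(text);
        if (raw.Length > size)
        {
            throw new ArgumentException("Value too long for its field.");
        }
        var field = new byte[size];
        raw.CopyTo(field, 0);
        return field;
    }

    // Writes one record at the stream's current position.
    public static void Write(Stream stream, int id, string name, string phone)
    {
        var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true);
        writer.Write(id);
        writer.Write(Pad(name, NameBytes));
        writer.Write(Pad(phone, PhoneBytes));
        writer.Flush();
    }

    // Overwrites the record with the given ID in place; returns false if it is absent.
    public static bool Update(string path, int id, string name, string phone)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        using (var reader = new BinaryReader(stream))
        {
            for (long offset = 0; offset + RecordSize <= stream.Length; offset += RecordSize)
            {
                stream.Position = offset;
                if (reader.ReadInt32() != id)
                {
                    continue;
                }
                stream.Position = offset; // rewind to the start of the record
                Write(stream, id, name, phone);
                return true;
            }
            return false;
        }
    }
}
```

Because every record occupies exactly RecordSize bytes, the file length never changes and the records after the updated one are untouched.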

Edge Case - Same Length Strings

Since replacing a record with data that's the same length only requires an overwrite, not a rewrite of the file, you might get some use out of testing the record length before grabbing the rest of the file. If you get lucky and the modified record is the same length, just seek back to the start of the record and write the data in place. That way, if you have a file with a ton of records in it, you'll get a much faster update whenever the length is the same.
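The length check can be sketched independently of the serializer. The helper below is hypothetical (RecordPatcher and its parameters are not from the question): it assumes you already know where the old record starts, how many bytes it occupies, and have serialized the replacement into a byte array, for example by serializing it to a MemoryStream first.

```csharp
using System;
using System.IO;

public static class RecordPatcher
{
    // Replaces the bytes of one record. `start` and `oldLength` describe the
    // existing record; `newBytes` is the already-serialized replacement.
    // Returns true when the fast in-place overwrite was possible.
    public static bool Replace(Stream stream, long start, long oldLength, byte[] newBytes)
    {
        if (newBytes.Length == oldLength)
        {
            // Same size: nothing after the record moves, so just overwrite it.
            stream.Position = start;
            stream.Write(newBytes, 0, newBytes.Length);
            return true;
        }

        // Different size: keep the tail, cut the file at the record, then rebuild.
        using (var tail = new MemoryStream())
        {
            stream.Position = start + oldLength;
            stream.CopyTo(tail);
            stream.SetLength(start);
            stream.Position = start;
            stream.Write(newBytes, 0, newBytes.Length);
            tail.Position = 0;
            tail.CopyTo(stream);
        }
        return false;
    }
}
```

The old record's length is simply the stream position after deserializing it minus the position before, so it costs nothing extra to capture in the read loop.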

Changing Format...

You said that this is a coding task, so you probably can't take this option, but if you can alter the storage format... let's just say that BinaryFormatter is definitely not your friend. There are much better ways to do it if you have the option. SQLite is my binary format of choice :)

Actually, since this appears to be a coding test, you might want to make a point of that. Write the code they asked for, then, if you have time, write a better format that doesn't rely on BinaryFormatter, or throw SQLite at the problem. Using an ORM like LinqToDB makes SQLite trivial. Explain to them that the file format they're using is inherently unstable and should be replaced with something that is stable, supported, and efficient.
