
Using memory-mapped files to compensate for running out of memory in C#

I have a problem. Previously I was after an algorithm to solve part of it (see Combine LINQ queries), but now I have run into a much bigger issue.

At around 540k directories, it crashes with an out-of-memory exception. :(

I am trying to process and store information about the files on the company SAN. We need to do this because some people keep data for 25 years that they don't need to, but it's hard to track. It's up to 70 TB of files in total, so, as you can imagine, that's a lot of files.

From what I've read, however, memory-mapped files can't be dynamically resized. Is this true? I can't know in advance how many files and directories there are.

If not (please say not), could someone give me a short example of how to make a dynamically sized memory-mapped file (my code is in the Combine LINQ queries question)? In short, I build a directory structure in memory where each directory holds its subdirectories plus its files (name, size, access date, modified date, and creation date).

Any clues would be appreciated as this would get around my problem if it's possible.

When you can't fit the whole thing into memory, you can stream your data with an IEnumerable; below is an example of that. I've been playing around with memory-mapped files as well, since I need the last drop of perf, but so far I've stuck with BinaryReader/BinaryWriter.
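On the "can memory-mapped files be dynamic" part of the question: in .NET a MemoryMappedFile gets a fixed capacity when the mapping is created, so the usual workaround is to pre-allocate more capacity than you currently need and recreate the mapping if you outgrow it. A minimal sketch of that, assuming .NET 4's System.IO.MemoryMappedFiles (the path, map name and 1 GB capacity are placeholders, not part of my solution below):

using System;
using System.IO;
using System.IO.MemoryMappedFiles;

class MmfSketch {
    static void Main() {
        // Capacity must be chosen when the mapping is created; pick a generous
        // upper bound. If the backing file is smaller, it is extended to this size.
        const long capacity = 1L << 30; // 1 GB, an arbitrary pre-allocation

        using (var mmf = MemoryMappedFile.CreateFromFile(
                   @"D:\scan.bin", FileMode.OpenOrCreate, "scanMap", capacity))
        using (var view = mmf.CreateViewAccessor(0, capacity)) {
            view.Write(0, 123456789L);        // write an Int64 at offset 0
            long check = view.ReadInt64(0);   // read it back
            Console.WriteLine(check);
        }

        // The mapping itself cannot grow: to get more room you dispose it and
        // create a new mapping with a larger capacity.
    }
}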

For the DB advocates: when you really need the last drop of perf, I write my own binary files too. Going out of process to a DB adds real overhead, and the security, logging, ACID guarantees, etc. all add up as well.

Here's an example that streams your f_results class.

EDIT

Updated the example to show how to write/read a tree of directory info. I keep one file that holds all the directories. This tree is loaded into memory in one go and then points to the files where all the f_results live. You still have to create a separate file per directory that holds the f_results for all of its files. How to do that depends on your code, but you should be able to figure it out (a rough sketch of one way is included after the code below).

Good luck!

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public class f_results {
    public String name { get; set; }
    public DateTime cdate { get; set; }
    public DateTime mdate { get; set; }
    public DateTime adate { get; set; }
    public Int64 size { get; set; }

    // write one to a file
    public void WriteTo(BinaryWriter wrtr) {
        wrtr.Write(name);
        wrtr.Write(cdate.Ticks);
        wrtr.Write(mdate.Ticks);
        wrtr.Write(adate.Ticks);
        wrtr.Write(size);
    }

    // read one from a file
    public f_results(BinaryReader rdr) {
        name = rdr.ReadString();
        cdate = new DateTime(rdr.ReadInt64());
        mdate = new DateTime(rdr.ReadInt64());
        adate = new DateTime(rdr.ReadInt64());
        size = rdr.ReadInt64();
    }

    // stream a whole file as an IEnumerable (so very little memory is needed)
    public static IEnumerable<f_results> FromFile(string dataFilePath) {
        // using blocks guarantee the stream and reader are disposed even if
        // enumeration is abandoned early or throws
        using (var file = new FileStream(dataFilePath, FileMode.Open))
        using (var rdr = new BinaryReader(file)) {
            var eos = rdr.BaseStream.Length;
            while (rdr.BaseStream.Position < eos) yield return new f_results(rdr);
        }
    }
}

class Program {
    static void Main(string[] args) {

        var d1 = new DirTree(@"C:\",
            new DirTree(@"C:\Dir1",
                new DirTree(@"C:\Dir1\Dir2"),
                new DirTree(@"C:\Dir1\Dir3")
                ),
                new DirTree(@"C:\Dir4",
                new DirTree(@"C:\Dir4\Dir5"),
                new DirTree(@"C:\Dir4\Dir6")
                ));

        var path = @"D:\Dirs.dir";

        // write the directory tree to a file
        // FileMode is not a flags enum; Create (create or truncate) is what's wanted here
        var file = new FileStream(path, FileMode.Create);
        var w = new BinaryWriter(file);
        d1.WriteTo(w);
        w.Close();
        file.Close();

        // read it back from the file
        var file2 = new FileStream(path, FileMode.Open);
        var rdr = new BinaryReader(file2);
        var d2 = new DirTree(rdr);
        rdr.Close();
        file2.Close();

        // now inspect d2 in the debugger to see that it was read back into memory

        // find files bigger than (roughly) 1GB
        var BigFiles = from f in f_results.FromFile(@"C:\SomeFile.dat")
                       where f.size > 1e9
                       select f;
    }
}

class DirTree {
    public string Path { get; private set; }
    private string FilesFile { get { return Path.Replace(':', '_').Replace('\\', '_') + ".dat"; } }

    public IEnumerable<f_results> Files() {
        return f_results.FromFile(this.FilesFile);
    }

    // you'll want to encapsulate this in real code but I didn't for brevity
    public DirTree[] _SubDirectories;

    public DirTree(BinaryReader rdr) {
        Path = rdr.ReadString();
        int count = rdr.ReadInt32();
        _SubDirectories = new DirTree[count];
        for (int i = 0; i < count; i++) _SubDirectories[i] = new DirTree(rdr);
    }

    public DirTree( string Path, params DirTree[] subDirs){
        this.Path = Path;
        _SubDirectories = subDirs;
    }

    public void WriteTo(BinaryWriter w) {
        w.Write(Path);           
        w.Write(_SubDirectories.Length);
        // depth first is the easiest way to do this
        foreach (var f in _SubDirectories) f.WriteTo(w);
    }
}
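As for the separate per-directory files mentioned in the EDIT, here is a rough sketch of one way to produce them while building the tree. The DirScanner, WriteFilesFor and BuildTree names are mine, not part of the answer; the records are written in the same field order as f_results.WriteTo, and the file name follows the same convention as DirTree.FilesFile. There is no error handling for access-denied directories.

using System;
using System.IO;

static class DirScanner {
    // Write one .dat file per directory, mirroring the field order of
    // f_results.WriteTo: name, cdate, mdate, adate, size.
    public static void WriteFilesFor(string directoryPath) {
        var dataFile = directoryPath.Replace(':', '_').Replace('\\', '_') + ".dat";
        using (var fs = new FileStream(dataFile, FileMode.Create))
        using (var w = new BinaryWriter(fs)) {
            foreach (var fi in new DirectoryInfo(directoryPath).GetFiles()) {
                w.Write(fi.Name);                  // name
                w.Write(fi.CreationTime.Ticks);    // cdate
                w.Write(fi.LastWriteTime.Ticks);   // mdate
                w.Write(fi.LastAccessTime.Ticks);  // adate
                w.Write(fi.Length);                // size
            }
        }
    }

    // Build the DirTree recursively, writing each directory's data file as we go.
    public static DirTree BuildTree(string directoryPath) {
        WriteFilesFor(directoryPath);
        var subDirs = Directory.GetDirectories(directoryPath);
        var children = new DirTree[subDirs.Length];
        for (int i = 0; i < subDirs.Length; i++) children[i] = BuildTree(subDirs[i]);
        return new DirTree(directoryPath, children);
    }
}

Calling BuildTree once on the root (e.g. the SAN share) gives you the root DirTree, which you can then persist with WriteTo as in Main above.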

