Fastest/safest file finding/parsing?

On c:, I have tens of thousands of *.foobar files. They're in all sorts of places (i.e. subdirs). These files are roughly 1 to 64 KB in size, and plaintext.

I have a class Foobar(string fileContents) that strongly types these .foobar files.

My challenge is to get a list of all the *.foobar files on c:, represented as an array of Foobar objects. What's the quickest way to do this?

I'm interested to find out if there's a better way (undoubtedly there is) than my first approach, which follows, and whether this approach of mine has any potential problems (e.g. I/O concurrency issues throwing exceptions?):

var files = Directory.EnumerateFiles
                (rootPath, "*.foobar", SearchOption.AllDirectories);

Foobar[] foobars = 
(
    from filePath in files.AsParallel()
    let contents = File.ReadAllText(filePath)
    select new Foobar(contents)
)
.ToArray();

Because permission errors (or other errors) can apparently stop the enumeration dead in its tracks, you may want to implement your own enumerator, something like this:

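using System;
using System.Collections.Generic;
using System.IO;
using System.Linq; // needed for errors.ToArray()

class SafeFileEnumerator : IEnumerable<string>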
{
  private string root;
  private string pattern;
  private IList<Exception> errors;
  public SafeFileEnumerator(string root, string pattern)
  {
     this.root = root;
     this.pattern = pattern;
     this.errors = new List<Exception>();
  }

  public SafeFileEnumerator(string root, string pattern, IList<Exception> errors)
  {
     this.root = root;
     this.pattern = pattern;
     this.errors = errors;
  }

  public Exception[] Errors()
  {
     return errors.ToArray();
  }
  class Enumerator : IEnumerator<string>
  {
     IEnumerator<string> fileEnumerator;
     IEnumerator<string> directoryEnumerator;
     string root;
     string pattern;
     IList<Exception> errors;

     public Enumerator(string root, string pattern, IList<Exception> errors)
     {
        this.root = root;
        this.pattern = pattern;
        this.errors = errors;
        fileEnumerator = System.IO.Directory.EnumerateFiles(root, pattern).GetEnumerator();
        directoryEnumerator = System.IO.Directory.EnumerateDirectories(root).GetEnumerator();
     }
     public string Current
     {
        get
        {
           if (fileEnumerator == null) throw new ObjectDisposedException("FileEnumerator");
           return fileEnumerator.Current;
        }
     }

     public void Dispose()
     {
        if (fileEnumerator != null)
           fileEnumerator.Dispose();
        fileEnumerator = null;
        if (directoryEnumerator != null)
           directoryEnumerator.Dispose();
        directoryEnumerator = null;
     }

     object System.Collections.IEnumerator.Current
     {
        get { return Current; }
     }

     public bool MoveNext()
     {
        if ((fileEnumerator != null) && (fileEnumerator.MoveNext()))
           return true;
        while ((directoryEnumerator != null) && (directoryEnumerator.MoveNext()))
        {
           if (fileEnumerator != null)
              fileEnumerator.Dispose();
           try
           {
              fileEnumerator = new SafeFileEnumerator(directoryEnumerator.Current, pattern, errors).GetEnumerator();
           }
           catch (Exception ex)
           {
              errors.Add(ex);
              continue;
           }
           if (fileEnumerator.MoveNext())
              return true;
        }
        if (fileEnumerator != null)
           fileEnumerator.Dispose();
        fileEnumerator = null;
        if (directoryEnumerator != null)
           directoryEnumerator.Dispose();
        directoryEnumerator = null;
        return false;
     }

     public void Reset()
     {
        Dispose();
        fileEnumerator = System.IO.Directory.EnumerateFiles(root, pattern).GetEnumerator();
        directoryEnumerator = System.IO.Directory.EnumerateDirectories(root).GetEnumerator();
     }
  }
  public IEnumerator<string> GetEnumerator()
  {
     return new Enumerator(root, pattern, errors);
  }

  System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
  {
     return GetEnumerator();
  }
}
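
For comparison, the same idea (skip directories that cannot be read, and record the error instead of aborting) can be sketched more compactly with an iterator method and an explicit stack. This is only a sketch under stated assumptions, not a replacement for the class above: the names SafeWalk and EnumerateFiles are made up for illustration, and it catches only UnauthorizedAccessException and IOException.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class SafeWalk // hypothetical helper name, not part of the answer above
{
    // Lazily yields files matching `pattern` under `root`. Directories that
    // cannot be listed are skipped and the exception is appended to `errors`.
    public static IEnumerable<string> EnumerateFiles(
        string root, string pattern, IList<Exception> errors)
    {
        var pending = new Stack<string>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            string dir = pending.Pop();
            string[] files = null;
            string[] subdirs = null;

            try
            {
                files = Directory.GetFiles(dir, pattern);
                subdirs = Directory.GetDirectories(dir);
            }
            catch (UnauthorizedAccessException ex) { errors.Add(ex); }
            catch (IOException ex) { errors.Add(ex); }

            // C# does not allow `yield return` inside a try block that has
            // a catch clause, so the results are emitted out here.
            if (files != null)
                foreach (string file in files)
                    yield return file;

            if (subdirs != null)
                foreach (string sub in subdirs)
                    pending.Push(sub);
        }
    }
}
```

Because the method is an iterator, nothing is read until the caller starts enumerating, and each directory is listed only when the walk reaches it.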

Great work. Here is an extension to your code that returns FileSystemInfo objects instead of string paths. There are some minor changes inline, such as adding a SearchOption parameter (like the native .NET one has) and trapping errors on the initial directory fetch in case access to the root folder is denied. Thanks again for the original posting!

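using System;
using System.Collections.Generic;
using System.IO;
using System.Linq; // needed for ToArray() and AsEnumerable()

public class SafeFileEnumerator : IEnumerable<FileSystemInfo>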
{
    /// <summary>
    /// Starting directory to search from
    /// </summary>
    private DirectoryInfo root;

    /// <summary>
    /// Filter pattern
    /// </summary>
    private string pattern;

    /// <summary>
    /// Indicator if search is recursive or not
    /// </summary>
    private SearchOption searchOption;

    /// <summary>
    /// Any errors captured
    /// </summary>
    private IList<Exception> errors;

    /// <summary>
    /// Create an Enumerator that will scan the file system, skipping directories where access is denied
    /// </summary>
    /// <param name="root">Starting Directory</param>
    /// <param name="pattern">Filter pattern</param>
    /// <param name="option">Recursive or not</param>
    public SafeFileEnumerator(string root, string pattern, SearchOption option)
        : this(new DirectoryInfo(root), pattern, option)
    {}

    /// <summary>
    /// Create an Enumerator that will scan the file system, skipping directories where access is denied
    /// </summary>
    /// <param name="root">Starting Directory</param>
    /// <param name="pattern">Filter pattern</param>
    /// <param name="option">Recursive or not</param>
    public SafeFileEnumerator(DirectoryInfo root, string pattern, SearchOption option)
        : this(root, pattern, option, new List<Exception>()) 
    {}

    // Internal constructor for the recursive iterator
    private SafeFileEnumerator(DirectoryInfo root, string pattern, SearchOption option, IList<Exception> errors)
    {
        if (root == null || !root.Exists)
        {
            throw new ArgumentException("Root directory is not set or does not exist.", "root");
        }
        this.root = root;
        this.searchOption = option;
        this.pattern = String.IsNullOrEmpty(pattern)
            ? "*"
            : pattern;
        this.errors = errors;
    }

    /// <summary>
    /// Errors captured while parsing the file system.
    /// </summary>
    public Exception[] Errors
    {
        get
        {
            return errors.ToArray();
        }
    }

    /// <summary>
    /// Helper class to enumerate the file system.
    /// </summary>
    private class Enumerator : IEnumerator<FileSystemInfo>
    {
        // Core enumerator that we will be walking through
        private IEnumerator<FileSystemInfo> fileEnumerator;
        // Directory enumerator to capture access errors
        private IEnumerator<DirectoryInfo> directoryEnumerator;

        private DirectoryInfo root;
        private string pattern;
        private SearchOption searchOption;
        private IList<Exception> errors;

        public Enumerator(DirectoryInfo root, string pattern, SearchOption option, IList<Exception> errors)
        {
            this.root = root;
            this.pattern = pattern;
            this.errors = errors;
            this.searchOption = option;

            Reset();
        }

        /// <summary>
        /// Current item the primary iterator is pointing to
        /// </summary>
        public FileSystemInfo Current
        {
            get
            {
                //if (fileEnumerator == null) throw new ObjectDisposedException("FileEnumerator");
                return fileEnumerator.Current as FileSystemInfo;
            }
        }

        object System.Collections.IEnumerator.Current
        {
            get { return Current; }
        }

        public void Dispose()
        {
            Dispose(true, true);
        }

        private void Dispose(bool file, bool dir)
        {
            if (file)
            {
                if (fileEnumerator != null)
                    fileEnumerator.Dispose();

                fileEnumerator = null;
            }

            if (dir)
            {
                if (directoryEnumerator != null)
                    directoryEnumerator.Dispose();

                directoryEnumerator = null;
            }
        }

        public bool MoveNext()
        {
            // Enumerate the files in the current folder
            if ((fileEnumerator != null) && (fileEnumerator.MoveNext()))
                return true;

            // Don't go recursive...
            if (searchOption == SearchOption.TopDirectoryOnly) { return false; }

            while ((directoryEnumerator != null) && (directoryEnumerator.MoveNext()))
            {
                Dispose(true, false);

                try
                {
                    fileEnumerator = new SafeFileEnumerator(
                        directoryEnumerator.Current,
                        pattern,
                        SearchOption.AllDirectories,
                        errors
                        ).GetEnumerator();
                }
                catch (Exception ex)
                {
                    errors.Add(ex);
                    continue;
                }

                // Open up the current folder file enumerator
                if (fileEnumerator.MoveNext())
                    return true;
            }

            Dispose(true, true);

            return false;
        }

        public void Reset()
        {
            Dispose(true,true);

            // Safely get the enumerators, including in the case where the root is not accessible
            if (root != null)
            {
                try
                {
                    fileEnumerator = root.GetFileSystemInfos(pattern, SearchOption.TopDirectoryOnly).AsEnumerable<FileSystemInfo>().GetEnumerator();
                }
                catch (Exception ex)
                {
                    errors.Add(ex);
                    fileEnumerator = null;
                }

                try
                {
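                    // List every subdirectory; applying the file pattern here would skip folders whose names do not match it
                    directoryEnumerator = root.GetDirectories("*", SearchOption.TopDirectoryOnly).AsEnumerable<DirectoryInfo>().GetEnumerator();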
                }
                catch (Exception ex)
                {
                    errors.Add(ex);
                    directoryEnumerator = null;
                }
            }
        }
    }
    public IEnumerator<FileSystemInfo> GetEnumerator()
    {
        return new Enumerator(root, pattern, searchOption, errors);
    }

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
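
The enumerators above only guard the directory walk; the question's File.ReadAllText call can still throw for an individual file (locked, deleted in the meantime, access denied) and abort the whole PLINQ query. A minimal sketch of guarding the read step as well follows. The Foobar class here is a hypothetical stand-in for the question's class, and FoobarLoader and Load are names invented for this example. Errors go into a ConcurrentQueue because a plain List<Exception> is not safe to add to from several threads.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical stand-in for the question's Foobar class.
class Foobar
{
    public string Contents { get; private set; }
    public Foobar(string fileContents) { Contents = fileContents; }
}

static class FoobarLoader // hypothetical helper name
{
    // Reads and parses each path in parallel. A file that cannot be read
    // is recorded in `errors` and skipped instead of failing the query.
    public static Foobar[] Load(
        IEnumerable<string> paths, ConcurrentQueue<Exception> errors)
    {
        return paths
            .AsParallel()
            .Select(path =>
            {
                try { return new Foobar(File.ReadAllText(path)); }
                catch (IOException ex) { errors.Enqueue(ex); return null; }
                catch (UnauthorizedAccessException ex) { errors.Enqueue(ex); return null; }
            })
            .Where(foobar => foobar != null)
            .ToArray();
    }
}
```

To feed it from the FileSystemInfo-based enumerator above, something along the lines of new SafeFileEnumerator(rootPath, "*.foobar", SearchOption.AllDirectories).OfType<FileInfo>().Select(f => f.FullName) should work as the paths argument. The OfType filter is there because that enumerator can also return directories whose names match the pattern.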
