
Is Recursion the Best Option for Determining the Max File Size in a Directory

I wrote the following method to determine the max file size:

    public static long GetMaxFileSize(string dirPath, long maxFileSize)
    {
        DirectoryInfo [] dirInfos = new DirectoryInfo(dirPath).GetDirectories();
        foreach (DirectoryInfo dirInfo in dirInfos)
        {
            DirectoryInfo [] subDirInfos = dirInfo.GetDirectories();
            foreach (DirectoryInfo subDirInfo in subDirInfos)
                maxFileSize = GetMaxFileSize(dirInfo.FullName, maxFileSize);

            FileInfo [] fileInfos = dirInfo.GetFiles();
            foreach (FileInfo fileInfo in fileInfos)
            {
                if (maxFileSize < fileInfo.Length)
                    maxFileSize = fileInfo.Length;
            }
        }

        return maxFileSize;
    }

Code Complete recommends that you "use recursion selectively". That being the case, I was wondering whether the community thinks this is a valid use of recursion. If not, is there a better technique for doing this?

EDIT: I can't use LINQ because it's not available in .NET 2.0, but I don't want to tag this as a .NET 2.0-only question, so that discussion points like Jared's below stay welcome.

EDIT: Cleaned up the code after an issue was spotted: the original version did not check the root directory's files.

   public static long GetMaxFileSize(DirectoryInfo dirInfo, long maxFileSize)
   {
       DirectoryInfo [] subDirInfos = dirInfo.GetDirectories();
       foreach (DirectoryInfo subDirInfo in subDirInfos)
       {
           maxFileSize = GetMaxFileSize(subDirInfo, maxFileSize);
       }

       FileInfo [] fileInfos = dirInfo.GetFiles();
       foreach (FileInfo fileInfo in fileInfos)
       {
           if (maxFileSize < fileInfo.Length)
               maxFileSize = fileInfo.Length;
       }

       return maxFileSize;
   }

I think a better way is to let the file system API do the searching for you via Directory.GetFiles. This method can search sub-directories automatically. This eliminates the question of whether or not to recurse and instead leaves the decision of how to implement the traversal to the designer of the API (who likely designed it for such a scenario).

This method combined with LINQ provides a very elegant solution:

    var max = Directory
      .GetFiles(path, "*", SearchOption.AllDirectories)
      .Select(x => new FileInfo(x))
      .Select(x => x.Length)
      .Max();

EDIT: As Jimmy pointed out, on .NET 4.0 and higher it's better to use EnumerateFiles to avoid the overhead of creating a potentially large array:

    var max = Directory
      .EnumerateFiles(path, "*", SearchOption.AllDirectories)
      .Select(x => new FileInfo(x))
      .Select(x => x.Length)
      .Max();
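For the .NET 2.0 constraint mentioned in the question, the same idea can be sketched without LINQ, since the Directory.GetFiles overload taking SearchOption is available there. The class name MaxFileSizeFinder is only illustrative, and returning 0 for a tree with no files is an assumption (the LINQ Max() call would throw on an empty sequence instead):

```csharp
using System.IO;

public static class MaxFileSizeFinder
{
    // Returns the size in bytes of the largest file under dirPath,
    // or 0 if the tree contains no files (an assumed convention).
    public static long GetMaxFileSize(string dirPath)
    {
        long maxFileSize = 0;

        // The API walks all sub-directories for us; no recursion in our code.
        string[] files = Directory.GetFiles(dirPath, "*", SearchOption.AllDirectories);
        foreach (string file in files)
        {
            long length = new FileInfo(file).Length;
            if (maxFileSize < length)
                maxFileSize = length;
        }

        return maxFileSize;
    }
}
```

One trade-off to keep in mind: with SearchOption.AllDirectories, a single sub-directory that cannot be read aborts the whole call with an exception, whereas a hand-written traversal can catch the error and skip just that directory.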

As far as tree traversal goes, I think recursion is a fantastic fit (the directory structure is a tree).

As long as you know that the directory structure isn't unbelievably deep, you shouldn't have to worry about overflowing the stack: stack usage grows with the nesting depth, not with the total number of files or directories.

Recursive solutions for navigating trees are almost always more elegant than iterative solutions.

It looks perfectly reasonable to me: with one exception, the depth your method will descend to is bounded by the depth of the file-system, which is guaranteed to be limited.

The exception is that if it runs on a file-system with symbolic links, you may follow a link to a directory which is an ancestor of the one you started at, and thereby enter an infinite loop. So you need to consider:

  • Will your app be deployed on such a file-system? (Usually Unix-like, but I think Vista and Win7 have support.)
  • Do you want to ignore symbolic links?
  • Do you want to add an extra parameter to the method, holding a list of previously entered directories?
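As a minimal sketch of the "ignore symbolic links" option, the recursive method can skip any sub-directory whose attributes include FileAttributes.ReparsePoint. The class name SymlinkAwareScanner is hypothetical, and treating every reparse point as a link to skip is an assumption: on NTFS that flag covers symbolic links and junctions, but also other reparse-point types, so this is a coarse filter.

```csharp
using System.IO;

public static class SymlinkAwareScanner
{
    // Same shape as the question's cleaned-up method, but never
    // descends into directories flagged as reparse points.
    public static long GetMaxFileSize(DirectoryInfo dirInfo, long maxFileSize)
    {
        foreach (FileInfo fileInfo in dirInfo.GetFiles())
        {
            if (maxFileSize < fileInfo.Length)
                maxFileSize = fileInfo.Length;
        }

        foreach (DirectoryInfo subDirInfo in dirInfo.GetDirectories())
        {
            // Skip links so a link back to an ancestor cannot cause a cycle.
            if ((subDirInfo.Attributes & FileAttributes.ReparsePoint) != 0)
                continue;

            maxFileSize = GetMaxFileSize(subDirInfo, maxFileSize);
        }

        return maxFileSize;
    }
}
```

The third option, tracking previously entered directories, would still follow links but needs a reliable way to compare directory identities, which is why skipping links is usually the simpler choice.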

In your implementation, however, you consistently ignore files in the current directory. Therefore, given the file-system

DirA
 | 
 +-DirB
 |  |
 |  +- DirC
 |  +- DirD
 |  |   |
 |  |   +-DirE
 |  |
 |  +- DirF
 |
 +- DirG

and the path to DirA, the only directories whose files you consider are DirB, DirG and DirE.

You need to call GetFiles() on the current directory, then GetDirectories() on the current directory and recurse into each of those. It may make sense to have your method take a DirectoryInfo object by default, and overload it with a wrapper taking a string.
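The suggested shape might look like the sketch below: the DirectoryInfo overload does the work (files of the current directory first, then each sub-directory), and the string overload is a thin wrapper. The class name DirectoryScanner and the choice to return 0 when no files exist are assumptions made for illustration.

```csharp
using System.IO;

public static class DirectoryScanner
{
    // Convenience wrapper for callers that only have a path.
    public static long GetMaxFileSize(string dirPath)
    {
        return GetMaxFileSize(new DirectoryInfo(dirPath));
    }

    public static long GetMaxFileSize(DirectoryInfo dirInfo)
    {
        long maxFileSize = 0;

        // 1. Files in the current directory, including the root itself.
        foreach (FileInfo fileInfo in dirInfo.GetFiles())
        {
            if (maxFileSize < fileInfo.Length)
                maxFileSize = fileInfo.Length;
        }

        // 2. Recurse into each immediate sub-directory.
        foreach (DirectoryInfo subDirInfo in dirInfo.GetDirectories())
        {
            long subMax = GetMaxFileSize(subDirInfo);
            if (maxFileSize < subMax)
                maxFileSize = subMax;
        }

        return maxFileSize;
    }
}
```

Compared with the question's version, this drops the maxFileSize parameter so callers do not have to pass an initial value; each call returns the maximum for its own subtree.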

Recursion should be used only with certain data structures. A file system, being a tree structure, is definitely a good case for recursion. I would go so far as to say that this is probably the best way to accomplish what you're trying to do.

The recommendation to use "recursion selectively" is about stack size. On structures that require a very large number of nested calls, you can overflow the stack and cause your code to crash.

The problem you will encounter will be with directory trees that require 65536 or more nested recursive calls.

I have found that 32-bit Windows XP crashes at around 64k recursive calls.
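If stack depth really is a concern, one common alternative is to replace the call stack with an explicit Stack<DirectoryInfo> that lives on the heap. This is only a sketch of that technique, not something proposed in the thread; the class name IterativeScanner is hypothetical, and it compiles against .NET 2.0 since it uses no LINQ.

```csharp
using System.Collections.Generic;
using System.IO;

public static class IterativeScanner
{
    // Depth-first traversal with an explicit stack instead of recursion,
    // so nesting depth is limited by heap memory rather than thread stack size.
    public static long GetMaxFileSize(DirectoryInfo root)
    {
        long maxFileSize = 0;
        Stack<DirectoryInfo> pending = new Stack<DirectoryInfo>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            DirectoryInfo current = pending.Pop();

            foreach (FileInfo fileInfo in current.GetFiles())
            {
                if (maxFileSize < fileInfo.Length)
                    maxFileSize = fileInfo.Length;
            }

            // Queue the sub-directories for later instead of calling ourselves.
            foreach (DirectoryInfo subDirInfo in current.GetDirectories())
                pending.Push(subDirInfo);
        }

        return maxFileSize;
    }
}
```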
