简体   繁体   中英

mkdirs() function in multithreaded environment

I am creating tree of files and folder. I am rewriting to multithreading. The only weak point I see is when creating folders. Right now it goes one by one (in depth). Before I write down the file I check if the path exists. If not, I use mkdirs to create all that is missing.

public void checkDir(String relativePath) {
        File file = new File(homePath + relativePath);
        if (!file.exists()) {
            if (file.mkdirs()) {
                log.info("Directory: " + homePath + relativePath + " is created!");
            } else {
                log.error("Failed to create directory: " + homePath + relativePath + " !");
            }
        }
    }

I have a question what happens when I use two threads. One has path A/B/C and the other one A/B/D. Let's say I have only A folder existing but not B. So both of them will check that the path doesn't exist and want to create it. So one of them will probably fail because the other one will be faster. So how can I manage this?

  1. I was thinking about removing the exists condition and leave it fail but there is no AlreadyExists exception I could catch..
  2. Create directory tree first (but I think there will be better way?)
  3. Put directory creation as critical section and make it sequential - not sure how to do this in spring, but anyway hot sure it is neccessary and won't slow down the process too much.

Maybe I am overthinking too much, but theoretically such situation might happen. Currently I use regular Thread, but I want to use spring TaskExecutor for this. It handles critical sections on its own, but this is not shared variable or anything and the path is different so I think it won't recognize it.

Thanks for suggestions.

The File.mkdirs() method is specified to create the directory, and all its parents, if they don't exist. Ergo there's no point in calling exists(). Existence will get checked anyway. Calling exists() is just wasting time. mkdirs() is essentially an atomic operation: there is really no point in trying to out-think it.

Note that a return value of false isn't necessarily a failure. It might just indicate that all the directories in the path already existed.

Basically the premiss of your question is false.

None of the answers seem to address the issue of whether mkdirs() is thread safe, one answer states that mkdirs() is atomic but there might exist cases where this fails. This function essentially deals with the file system so it probably involves a system call to the operating system on the respective host, and determining whether those system calls are actually thread safe could be impossible if you don't already know the target system your application will be used on.

For example, even though mkdirs() checks for existence before creating the folder structure what would happen in the following case,

Thread 1 calls mkdirs(), it inherently checks for the existence of the folder structure and determines it doesn't exist. At that time, Thread 1 gets pre-empted.

Thread 2 calls mkdirs(), it inherently checks for the existence of the folder structure and determines that it doesn't exist and subsequently moves on to create the folder structure.

Thread 1 starts up again and contines to try creating the folder structure with the previous determination that it didn't exist before.

What happens there? I don't know, this sequence of events would be hard to test and especially with the knowledge that the create folder system call varies between operating systems. Your best bet for thread safety and to avoid introducing errors that would be potentially difficult to track and debug would be to implement a degree of mutual exclusion at this critical section in the code.

I guess it would be easy to take a naive approach and declare a single 'global' variable that both threads would have access to, for example a Boolean b and then add the following code around your critical section,

synchronized(b) {
     // Your critical section here
}

This would guarantee that if one thread has locked b it alone will only get to access the critical section while the other one waits, thus making sure that mkdir() won't get called by both threads.

However, if you want to learn more about multi-threading and how to implement mutual exclusion on a lower level, in this case I'd suggest you look at semaphores and how they could be implemented to solve this.

As EJP points out a return of false can mean many things, some errors, some not. If you want to log the fact it was actually unable to create the directory, you should check the existence afterwards:

public final class DirectoryHelper {
   private DirectoryHelper(){}

   public static boolean createDirectories(File path) {
      if (path.mkdirs()) return true; //definitely has new dir
      // if false, just look afterwards for the existence of the directory
      // also opportunity to throw own exceptions if you prefer
      return path.exists() && path.isDirectory();
   }
}

I've written here a new method that returns false only if the directory is not there afterwards. I don't care if it's just been made or existed already. Due to the new order I don't need a synchronized block either.

Your code then looks like:

public void checkDir(String relativePath) {
    File file = new File(new File(homePath), relativePath);
    if (!file.exists()) { // just to prevent logging of existing dirs
        if (DirectoryHelper.createDirectories(file)) {
            log.info("Directory: " + file + " is created!");
        } else {
            log.error("Failed to create directory: " + file + " !");
        }
    }
};

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM