
Avoid detecting incomplete files when watching a directory for changes in Java

I am watching a directory for incoming files (using FileAlterationObserver from Apache Commons IO).

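import java.io.File;
import org.apache.commons.io.monitor.FileAlterationListenerAdaptor;
import org.apache.commons.io.monitor.FileAlterationMonitor;
import org.apache.commons.io.monitor.FileAlterationObserver;

// Extending the adaptor (instead of implementing FileAlterationListener
// directly) means only the callbacks of interest have to be overridden.
class Example extends FileAlterationListenerAdaptor {
    public void prepare() throws Exception {
        File directory = new File("/tmp/incoming");
        FileAlterationObserver observer = new FileAlterationObserver(directory);
        observer.addListener(this);
        // Poll the directory every 10 milliseconds.
        FileAlterationMonitor monitor = new FileAlterationMonitor(10);
        monitor.addObserver(observer);
        monitor.start(); // start() is declared to throw Exception
        // ...
    }

    public void handleFile(File f) {
        // FIXME: this should be called when the writes that
        // created the file have completed, not before
    }

    @Override
    public void onFileCreate(File f) {
        handleFile(f);
    }

    @Override
    public void onFileChange(File f) {
        handleFile(f);
    }
}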

The files are written in place by processes that I have no control over.

The problem with this code is that my callback fires as soon as the file is created. I need it to fire only after the file has been changed and the write to it has completed (perhaps by detecting when the file has stopped changing).

What's the best way to do that?

I had a similar problem. At first I thought I could use the FileWatcher service, but it doesn't work on remote volumes, and I had to monitor incoming files on a network-mounted drive.

Then I thought I could simply monitor the change in file size over a period of time and consider the file done once the file size had stabilized (as fmucar suggested). But I found that in some instances on large files, the hosting system would report the full size of the file it was copying, rather than the number of bytes it had written to disk. This of course made the file appear stable, and my detector would catch the file while it was still in the process of being written.

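I was eventually able to get the monitor to work by relying on the exception that FileInputStream throws when the file cannot be opened. This worked wonderfully for detecting whether a file was still being written to, even when the file was on a network-mounted drive. Note that it depends on the writer holding the file in a way that blocks readers, which is typical on Windows but not guaranteed on every platform.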

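      // Needs java.io.FileInputStream and java.io.IOException.
      long oldSize = 0L;
      long newSize = 1L;
      boolean fileIsOpen = true;

      // Keep polling while the size is still growing or the file cannot be opened.
      while ((newSize > oldSize) || fileIsOpen) {
          oldSize = this.thread_currentFile.length();
          try {
              Thread.sleep(2000);
          } catch (InterruptedException e) {
              // Restore the interrupt flag and stop waiting; the caller
              // should check the flag before trusting the file.
              Thread.currentThread().interrupt();
              break;
          }
          newSize = this.thread_currentFile.length();

          // The open fails while the writer still holds the file.
          // try-with-resources closes the probe stream immediately.
          try (FileInputStream probe = new FileInputStream(this.thread_currentFile)) {
              fileIsOpen = false;
          } catch (IOException e) {
              fileIsOpen = true; // still being written, or not readable yet
          }
      }

      System.out.println("New file: " + this.thread_currentFile.toString());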

A generic solution to this problem seems impossible from the "consumer" end. The "producer" may temporarily close the file and then resume appending to it. Or the "producer" may crash, leaving an incomplete file in the file system.

A reasonable pattern is to have the "producer" write to a temp file that's not monitored by the "consumer". When it's done writing, rename the file to something that's actually monitored by the "consumer", at which point the "consumer" will pick up the complete file.
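
As a rough sketch of that pattern on the producer side, assuming the producer is written in Java and that the consumer is set up to ignore names ending in .part (the suffix, the class name AtomicPublisher and the file name report.csv are all made up for illustration):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class AtomicPublisher {

    /**
     * Writes the data to a temporary ".part" file next to the target and then
     * renames it, so the final name only ever refers to a complete file.
     */
    static Path publish(Path watchedDir, String finalName, byte[] data) throws IOException {
        // Assumption: the consumer skips any file whose name ends in ".part".
        Path temp = watchedDir.resolve(finalName + ".part");
        Path target = watchedDir.resolve(finalName);
        Files.write(temp, data);
        try {
            return Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // Fallback for file systems without atomic rename; this move is
            // no longer guaranteed to be atomic.
            return Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("incoming");
        Path done = publish(dir, "report.csv", "a,b,c\n".getBytes(StandardCharsets.UTF_8));
        System.out.println("Published " + done);
    }
}
```

The temp file is kept in the same directory on purpose: a rename can only be atomic within one file system, so writing to a different volume and moving afterwards would bring the original problem back.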

I don't think you can achieve what you want unless you have some file system constraints and guarantees. For example, consider the following scenario:

  • File X created
  • A bunch of change events are triggered that correspond with writing out of file X
  • A lot of time passes with no updates to file X
  • File X is updated.

If file X cannot be updated after it's written out, you can have a thread of execution that calculates the elapsed time from the last update to now, and after some interval decides that the file write is complete. But even this has issues. If the file system is hung, and the write does not occur for some time, you could erroneously conclude that the file is finished writing out.
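
A minimal sketch of that elapsed-time idea, assuming the listener callbacks record a timestamp per file and something else periodically asks which files have gone quiet. QuietPeriodTracker and its methods are hypothetical names, not part of Commons IO, and timestamps are passed in explicitly to keep the logic easy to follow:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

final class QuietPeriodTracker {
    private final long quietMillis;
    private final Map<String, Long> lastEvent = new HashMap<>();

    QuietPeriodTracker(long quietMillis) {
        this.quietMillis = quietMillis;
    }

    /** Record a create/change event; call this from the listener callbacks. */
    synchronized void touch(String path, long nowMillis) {
        lastEvent.put(path, nowMillis);
    }

    /** Return, and stop tracking, every path with no event for at least quietMillis. */
    synchronized List<String> drainQuiet(long nowMillis) {
        List<String> ready = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastEvent.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() >= quietMillis) {
                ready.add(entry.getKey());
                it.remove();
            }
        }
        return ready;
    }

    public static void main(String[] args) {
        QuietPeriodTracker tracker = new QuietPeriodTracker(5000);
        tracker.touch("/tmp/incoming/x", 0);
        tracker.touch("/tmp/incoming/x", 3000);
        System.out.println(tracker.drainQuiet(6000)); // prints []
        System.out.println(tracker.drainQuiet(8000)); // prints [/tmp/incoming/x]
    }
}
```

This does nothing about the hung file system case: a writer that stalls for longer than the quiet period will still be reported as finished.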

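You can check the file size two or more times over a few seconds; if the size has not changed, you can conclude that the file change has completed and proceed.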
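
The size check described above could look something like this. It is only a sketch: SizeStability and waitUntilStable are illustrative names, and the number of checks and the interval are left to the caller. As noted in an earlier answer, some systems report the final size before all bytes are written, so this is a heuristic rather than a guarantee.

```java
import java.io.File;

final class SizeStability {

    /**
     * Samples the file length every intervalMillis and returns true once it
     * has been unchanged for `checks` consecutive samples. Returns false if
     * the waiting thread is interrupted. A missing file has length 0, so the
     * caller should make sure the file exists first.
     */
    static boolean waitUntilStable(File file, int checks, long intervalMillis) {
        long last = file.length();
        int unchanged = 0;
        while (unchanged < checks) {
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            long now = file.length();
            if (now == last) {
                unchanged++;
            } else {
                unchanged = 0; // size moved, start counting again
                last = now;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        File f = new File(args.length > 0 ? args[0] : "/tmp/incoming/example.dat");
        if (f.exists() && waitUntilStable(f, 3, 1000)) {
            System.out.println("Looks complete: " + f);
        }
    }
}
```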

If you extend FileAlterationListenerAdaptor instead of implementing FileAlterationListener directly, you only have to override the methods you need, and you can then monitor the files with a FileAlterationMonitor:

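import java.io.File;
import java.io.IOException;
import org.apache.commons.io.monitor.FileAlterationListener;
import org.apache.commons.io.monitor.FileAlterationListenerAdaptor;
import org.apache.commons.io.monitor.FileAlterationMonitor;
import org.apache.commons.io.monitor.FileAlterationObserver;

public static void main( String[] args ) throws Exception {

    // "dir" was not defined in the original snippet; the directory from the
    // question is assumed here.
    File dir = new File( "/tmp/incoming" );
    FileAlterationObserver fao = new FileAlterationObserver( dir );
    final long interval = 500; // polling interval in milliseconds
    FileAlterationMonitor monitor = new FileAlterationMonitor( interval );
    FileAlterationListener listener = new FileAlterationListenerAdaptor() {

        @Override
        public void onFileCreate( File file ) {
            try {
                System.out.println( "File created: " + file.getCanonicalPath() );
            } catch( IOException e ) {
                e.printStackTrace( System.err );
            }
        }

        @Override
        public void onFileDelete( File file ) {
            try {
                System.out.println( "File removed: " + file.getCanonicalPath() );
            } catch( IOException e ) {
                e.printStackTrace( System.err );
            }
        }

        @Override
        public void onFileChange( File file ) {
            // getName() cannot throw, so no try/catch is needed here.
            System.out.println( file.getName() + " changed" );
        }
    };
    // Add listeners...
    fao.addListener( listener );
    monitor.addObserver( fao );
    monitor.start();
}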
