
Java cluster, run task only once

We have a Java process that listens to a directory X on the file system using Apache Commons VFS. Whenever a new file is exported to this directory, our process kicks in. We first rename the file to filename.processing, then parse the file name, extract some information from the file, and insert it into tables before sending the file to a document management system. The application is single-threaded on each cluster node. Now consider this running in a cluster environment: we have 5 servers, so 5 different VMs are trying to access the same file. The whole implementation rests on the assumption that only one process can rename the file to .processing at a given time, because the OS will not allow multiple processes to modify the file at the same time. Once one node gets hold of a file and renames it to .processing, the other nodes ignore files with the .processing suffix.

This worked fine for more than a year, but we have just found a few duplicates. It looks like multiple nodes got hold of the same file: say nodes a, b, and c all got access to f.pdf and renamed it to f.pdf.processing at the same time (I am still baffled as to how the OS allows the file to be modified concurrently). As a result, nodes a, b, and c each processed the file and sent it to the document management system, so now there are 3 duplicate files.

In short, I am looking for approaches to running a task only once in a cluster environment. I also want a failover mechanism, so that if something goes wrong on one node, another node picks up the task. We don't want to set an environment variable such as master=true on one box, because that ties the task to a single node and does not handle failover.

Any kind of help is appreciated.

See the following post about file locking: How do filesystems handle concurrent read/write?

Read and write operations on files (including renaming) are not atomic and not well synchronized between processes the way you assumed, at least not on most operating systems.

However, creating a new file is usually an atomic operation, and you can use that to your advantage. The concept is called whole-file locking.
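As a rough illustration of the idea, the sketch below claims an input file by creating a sibling marker file with `Files.createFile`, which fails with `FileAlreadyExistsException` if the marker already exists. The class name `MarkerFileClaim`, the `.lock` suffix, and the `release` helper are assumptions made for this example and are not part of the original setup. Whether creation is truly atomic on a shared or network file system depends on that file system, so treat this as a starting point rather than a guarantee.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: one marker file per input file acts as the lock.
final class MarkerFileClaim {

    // Marker lives next to the input, e.g. f.pdf -> f.pdf.lock (suffix is an assumption).
    private static Path markerFor(Path input) {
        return input.resolveSibling(input.getFileName() + ".lock");
    }

    /** Returns true only for the caller that managed to create the marker file. */
    static boolean tryClaim(Path input) throws IOException {
        try {
            // The existence check and the creation happen as a single operation.
            Files.createFile(markerFor(input));
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // someone else already claimed this input
        }
    }

    /** Removes the marker so the input can be claimed again (e.g. after a failure). */
    static void release(Path input) throws IOException {
        Files.deleteIfExists(markerFor(input));
    }
}
```

A node would call `tryClaim` before touching the file and skip the file when it returns false. A marker left behind by a crashed node blocks everyone, so a failover scheme would still need some way to detect and clear stale markers, for example by checking the marker's age.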

We are implementing our own synchronization logic using a shared lock table inside the application database. This allows every cluster node to check whether a job is already running before starting it itself.
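The answer does not show the table or the queries, so the following is only a sketch of the claim rule such a lock table could enforce. An in-memory map stands in for the database table so that the example runs on its own; in a real database the same rule would typically be implemented with a unique key on the job name plus a conditional update. The class `LeaseLockTable`, its method names, and the lease-expiry rule (added here to cover the failover requirement from the question) are all assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a shared lock table: one row per job key.
final class LeaseLockTable {

    // One "row": who holds the job and until when the claim is valid.
    private static final class Lease {
        final String owner;
        final long expiresAtMillis;

        Lease(String owner, long expiresAtMillis) {
            this.owner = owner;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Lease> rows = new ConcurrentHashMap<>();

    /**
     * Claims the job if no row exists or the existing lease has expired.
     * The expiry is what lets another node take over after a crash.
     */
    boolean tryAcquire(String jobKey, String nodeId, long nowMillis, long leaseMillis) {
        Lease mine = new Lease(nodeId, nowMillis + leaseMillis);
        // compute() runs atomically per key, like a conditional insert/update would.
        Lease result = rows.compute(jobKey, (key, current) ->
                (current == null || current.expiresAtMillis <= nowMillis) ? mine : current);
        return result == mine;
    }

    /** Deletes the row, but only if the caller is the current owner. */
    void release(String jobKey, String nodeId) {
        rows.computeIfPresent(jobKey, (key, current) ->
                current.owner.equals(nodeId) ? null : current);
    }
}
```

Each node would call `tryAcquire` with the file name as the job key before processing and `release` afterwards. The lease length has to be longer than the slowest expected processing time, otherwise a second node could take over a job that is still running.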

Did you try FileLock tryLock() or lock() before renaming the file to .processing? If not, I think you should, so that only one application is allowed to change the file.
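A minimal sketch of that suggestion using `java.nio.channels.FileChannel.tryLock()` might look like the following. It locks a separate lock file rather than the input file itself, which is an assumption made here so that the input can still be renamed while the lock is held; the class and method names are hypothetical. Note that a `FileLock` is held on behalf of the whole JVM, that some platforms treat such locks as advisory only, and that behavior on network file systems is system-dependent.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: runs a task only if an exclusive lock on lockFile can be taken.
final class SidecarFileLock {

    /** Returns true if the lock was obtained and the task ran, false otherwise. */
    static boolean runIfLocked(Path lockFile, Runnable task) throws IOException {
        try (FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock lock;
            try {
                lock = channel.tryLock(); // non-blocking; null if another process holds it
            } catch (OverlappingFileLockException e) {
                return false; // the lock is already held inside this JVM
            }
            if (lock == null) {
                return false;
            }
            try {
                task.run();
                return true;
            } finally {
                lock.release();
            }
        }
    }
}
```

The rename to .processing and the subsequent processing would go inside the task, so that a node that fails to get the lock simply skips the file.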

Update: Sorry, I forgot that you asked about VFS. In Apache VFS (in fact, in Apache Synapse) I found the VFSUtils class, which has the following method:

public static boolean acquireLock(org.apache.commons.vfs2.FileSystemManager fsManager,
                                  org.apache.commons.vfs2.FileObject fo)

Acquires a file item lock before processing the item, guaranteeing that the file is not processed while it is being uploaded and/or the item is not processed by two listeners
Parameters:
   fsManager - used to resolve the processing file
   fo - representing the processing file item
Returns:
   boolean true if the lock has been acquired or false if not

I think that method can solve your problem (if you can use Apache Synapse in your project).

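Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost, please cite this site's URL or the original address. For any questions, contact yoyou2525@163.com.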

 