Is it safe to use flock on AWS EFS to emulate a critical section?

According to the docs, AWS EFS (Amazon Elastic File System) supports file locking:

Amazon EFS provides a file system interface and file system access semantics (such as strong data consistency and file locking).

On a local file system (e.g., ext4), flock can be used in shell scripts to create a critical section. For example, this answer describes a pattern that I have used in the past:

#!/bin/bash
(
  # Wait for lock on /var/lock/.myscript.exclusivelock (fd 200) for 10 seconds
  flock -x -w 10 200 || exit 1

  # Do stuff

) 200>/var/lock/.myscript.exclusivelock

Can the same pattern be applied on EFS? Amazon mentions that they are using the NFSv4 protocol, but does it provide the same guarantees as flock on ext4?

If not, how can you enforce that an operation runs exclusively across all EC2 instances that are attached to the same EFS volume? It is sufficient if it works for processes, as I'm not planning to run multiple threads.

Or did I misunderstand the locking support provided in NFSv4? Unfortunately, I don't know the details of the protocol, but providing atomicity in a distributed system is a much harder problem than on a local machine.

Update: small-scale experiment

Not a proof, of course, but in my tests it works across multiple instances. For now, I assume the pattern is safe to use. Still, it would be nice to know whether it is theoretically sound.
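An experiment of this kind could look like the following sketch. It is not the test I actually ran, only an illustration: the helper names `increment_counter` and `worker` and the file names `counter` and `.counter.lock` are made up for this example. Each worker performs a read-modify-write on a shared counter file while holding the lock; if mutual exclusion holds, no increment is lost.

```shell
#!/bin/bash
# Hypothetical helper: increment a counter file under an exclusive lock.
# DIR is assumed to be a directory on the shared file system.
increment_counter() {
  local dir=$1
  (
    # Wait up to 10 seconds for the lock on fd 200, as in the pattern above.
    flock -x -w 10 200 || exit 1
    # Read-modify-write: without the lock, concurrent updates could be lost.
    n=$(cat "$dir/counter" 2>/dev/null || echo 0)
    echo $((n + 1)) > "$dir/counter"
  ) 200>"$dir/.counter.lock"
}

# Hypothetical helper: perform COUNT locked increments in DIR.
worker() {
  local dir=$1 count=$2 i
  for ((i = 0; i < count; i++)); do
    increment_counter "$dir" || return 1
  done
}
```

To use it, source the file and start something like `worker /mnt/efs/locktest 1000` at the same time on every instance (the path is an assumption; use a directory on your own EFS mount), then compare the final value of `counter` with the total number of increments. A matching value is evidence, not a proof, that the lock serializes access.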

It should work.

The flock command as used in the pattern in the question should work on all NFS file systems. That means it will also work on EFS, which implements the NFSv4 protocol. In practice, I have not encountered any problems so far when using it to synchronize shell scripts on different EC2 instances.
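For scripts that run on several instances, it can help to wrap the pattern in a function so that a lock timeout can be told apart from a failure of the command itself. The following is a minimal sketch under stated assumptions: the function name `run_exclusive` and the exit code 75 are arbitrary choices made for this example, not part of flock.

```shell
#!/bin/bash
# Hypothetical wrapper: run_exclusive LOCKFILE TIMEOUT COMMAND [ARGS...]
# Runs COMMAND while holding an exclusive lock on LOCKFILE.
# Returns 75 (arbitrary choice) if the lock was not acquired within
# TIMEOUT seconds; otherwise returns the exit status of COMMAND.
run_exclusive() {
  local lockfile=$1 timeout=$2
  shift 2
  (
    # fd 200 is opened for writing by the redirection below.
    flock -x -w "$timeout" 200 || exit 75
    "$@"
  ) 200>"$lockfile"
}
```

A call could look like `run_exclusive /mnt/efs/.myscript.exclusivelock 10 ./do_stuff.sh`, where both the mount point `/mnt/efs` and the script name are placeholders. Note that a command which itself exits with 75 is indistinguishable from a timeout in this sketch.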


Depending on your use case, you have to be aware of the gotchas of file locking on Linux, although most of them are not NFS specific. For instance, the pattern above operates on the process level and cannot be used if you want to synchronize multiple threads.

While reading, I came across old issues. In kernels prior to 2.6.12, there seemed to be problems with NFS and the flock system call (e.g., see "flock vs lockf on Linux").

These issues should not apply here, as the behavior has been improved in newer kernels. Looking at the source code of the flock command, you can confirm that it still uses the flock system call, but on NFS that call may in turn be implemented by the safe fcntl system call:

while (flock(fd, type | block)) {
  ...
  case EBADF:       /* since Linux 3.4 (commit 55725513) */
        /* Probably NFSv4 where flock() is emulated by fcntl().
         * Let's try to reopen in read-write mode.
         */

Note: the workaround refers to this commit in the Linux kernel:

Since we may be simulating flock() locks using NFS byte range locks, we can't rely on the VFS having checked the file open mode for us.
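One practical consequence, as far as I understand it: when flock is emulated with fcntl byte-range locks, an exclusive lock needs a file descriptor that is open for writing. The pattern in the question already satisfies this because `200>` opens the file for writing. If you would rather not truncate the lock file on every run, a variant is to open it read-write with `<>`. This is a sketch; the function name `lock_rw` and the choice of fd 9 are assumptions made for this example.

```shell
#!/bin/bash
# Hypothetical variant: lock_rw LOCKFILE COMMAND [ARGS...]
# Opens LOCKFILE read-write on fd 9 (created if missing, never truncated),
# takes an exclusive lock with a 10-second timeout, then runs COMMAND.
lock_rw() {
  local lockfile=$1
  shift
  (
    flock -x -w 10 9 || exit 1
    "$@"
  ) 9<>"$lockfile"
}
```

Because `<>` does not truncate, any content already stored in the lock file (for example a PID written by an earlier holder) survives the call.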
