os.link() vs. os.rename() vs. os.replace() for writing atomic write files. What is the best approach?

Hi, I am trying to write an atomic write function like so:

with tempfile.NamedTemporaryFile(mode="w", dir=target_directory) as f:
    # perform file writing operation
    os.replace(f.name, target_file_name)

I am struggling to figure out what the best action would be in line 3. Should I use os.replace() or os.rename(), or should I create a hard link between the temp file and the target file using os.link()?

Does os.link() use more memory? What are the benefits of each, and are all of them atomic?

First, os.link creates a hard link, which means that the source and destination names refer to the same data on disk. Nothing is copied, so there is no extra memory or disk-space overhead. Sadly, hard links are not supported by all file systems. For instance, NTFS supports hard links, while FAT and ReFS do not.
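To make the "same data, no copy" point concrete, here is a minimal sketch, assuming a file system that supports hard links. The function name demo_hard_link and the file names are made up for illustration. It also shows that os.link() refuses to overwrite an existing destination, which matters if the target file already exists.

```python
import os
import tempfile


def demo_hard_link():
    """Create a file plus a hard link to it and report what the two names share."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.txt")  # hypothetical file names
        dst = os.path.join(tmp, "dst.txt")
        with open(src, "w", encoding="utf8") as f:
            f.write("payload")

        os.link(src, dst)  # adds a second directory entry; no data is copied

        same_file = os.path.samefile(src, dst)  # both names point at one file
        link_count = os.stat(src).st_nlink      # number of names for that file

        # os.link() does not replace an existing destination.
        try:
            os.link(src, dst)
            refused = False
        except FileExistsError:
            refused = True

        with open(dst, encoding="utf8") as f:
            content = f.read()
    return same_file, link_count, refused, content
```

Because os.link() fails when the destination exists, it cannot replace a target file on its own; it is better suited to "create only if absent" semantics.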

Second, os.replace should be preferred to os.rename because it behaves the same way across platforms: it overwrites an existing destination on both Unix and Windows, whereas os.rename raises an error on Windows if the destination already exists. Note that os.replace is only available in Python 3.3+. Another important point is that the source and destination paths must be on the same file system.

As for atomicity, there seems to be a very high probability (but not 100%) that os.replace is atomic in all possible cases on Unix and Windows (related links: 1, 2, 3). In any case, this is the recommended approach to avoid race conditions and TOCTOU bugs. I have never encountered, or been able to reproduce, a situation where a call to os.replace ended with corrupted src or dst data. However, as long as such behavior is not a requirement in the official documentation, os.replace should not be considered an atomic call (especially on Windows).
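As a small illustration of the overwrite behaviour, the sketch below replaces an existing file with a new one inside a single directory. The function name replace_over_existing and the file names are invented for the example, and it only demonstrates the overwrite, not atomicity.

```python
import os
import tempfile


def replace_over_existing():
    """Overwrite an existing file with os.replace() and report the result."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "new.txt")      # hypothetical file names
        dst = os.path.join(tmp, "current.txt")
        with open(dst, "w", encoding="utf8") as f:
            f.write("old")
        with open(src, "w", encoding="utf8") as f:
            f.write("new")

        # Works on Unix and Windows even though dst already exists.
        # os.rename(src, dst) would raise FileExistsError here on Windows.
        os.replace(src, dst)

        with open(dst, encoding="utf8") as f:
            content = f.read()
        return content, os.path.exists(src)
```

Both files live in the same temporary directory, so the same-file-system requirement is met by construction.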

Your example code is definitely not atomic by definition: at any moment, another process can break the data integrity of your temporary file, and abruptly terminating the process on non-Windows systems can even leave your temp file in the specified directory forever. To solve these problems, you may need synchronization primitives such as locks, and the logic of your code must account for even the most improbable cases of interruption or data corruption.
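One common way to tighten the original snippet is sketched below. It is not taken from the answer and does not add any locking; it only assumes that the temp file should be created next to the target, flushed to disk before the rename, and removed if anything fails. The helper name atomic_write_text is hypothetical.

```python
import os
import tempfile


def atomic_write_text(path, text, encoding="utf8"):
    """Write text to a temp file in the target's directory, then os.replace() it."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp() never deletes the file on its own, so a successful
    # os.replace() is not followed by an automatic cleanup attempt.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding=encoding) as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)
    except BaseException:
        # Best-effort cleanup so a failed write leaves no stray temp file.
        try:
            os.remove(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A hard kill of the process can still leave the temp file behind, and two writers can still race each other, so this narrows the problems described above rather than eliminating them. Note also that mkstemp() creates the file with owner-only permissions, which the target file then inherits.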

Here is an example of a common case, where some data should be read from an existing file or, if the file does not exist, created in it:

import os
import time
filename = 'data' # file with data
temp_filename = 'data.temp' # temp used to create 'data'

def get_data():
    while True:
        try:
            os.remove(temp_filename)
        except FileNotFoundError: # if no data.temp
            try: # check if data already exists:
                with open(filename, 'rt', encoding='utf8') as f:
                    return f.read() # return data here
            except FileNotFoundError:
                pass # create data
        except PermissionError: # if another process/thread is creating data.temp right now
            time.sleep(0.1) # wait for it
            continue
        else:
            pass # something went wrong and it's better to create all again

        # data creation:
        excl_access = 'xt' # raises error if file exists - used as a primitive lock
        try:
            with open(temp_filename, excl_access, encoding='utf8') as f:
                # process can be interrupted here 
                f.write('Hello ') # or here
                f.write('world!') # or here
        except FileExistsError: # another one is creating it now
            time.sleep(0.1) # wait for it
            continue
        except Exception: # something went wrong
            continue
        try:
            os.replace(temp_filename, filename) # not guaranteed to be atomic in 100% of cases
        except FileNotFoundError:
            continue # try again
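The 'xt' mode is what makes the temp file act as a primitive lock in the code above. A minimal sketch of just that piece, with a hypothetical helper name try_acquire, could look like this:

```python
import os
import tempfile


def try_acquire(lock_path):
    """Return True if this call created lock_path, False if it already existed."""
    try:
        # 'x' creates the file exclusively and fails if it is already there.
        with open(lock_path, "x", encoding="utf8"):
            return True
    except FileExistsError:
        return False


def demo_lock():
    """Acquire twice, release by deleting the file, then acquire again."""
    with tempfile.TemporaryDirectory() as tmp:
        lock_path = os.path.join(tmp, "data.temp")
        first = try_acquire(lock_path)
        second = try_acquire(lock_path)  # the file exists, so this is refused
        os.remove(lock_path)             # releasing the lock
        third = try_acquire(lock_path)
    return first, second, third
```

A lock file like this stays behind if its owner crashes, which is why get_data() starts each iteration by trying to remove a leftover data.temp.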

Here's a related question with some answers that recommend external libraries for handling atomic file creation.
