Is Python 3 Path.write_text from pathlib atomic?

I was wondering whether the Path.write_text(data) function from pathlib is atomic.

If not, are there scenarios where we could end up with a file created in the filesystem but not containing the intended content?

To be more specific, as the comment from @ShadowRanger suggested, what I care about is whether the file contains either the original data or the new data, but never something in between. That is actually a weaker requirement than full atomicity.

On the specific case of the file containing the original data or the new data, and nothing in between:

No, it does not do any tricks with opening a temp file in the same directory, populating it, and finishing with an atomic rename to replace the original file. The current implementation is guaranteed to be at least two distinct operations:

  1. Opening the file in write mode (which implicitly truncates it), and
  2. Writing out the provided data (which may take multiple system calls depending on the size of the data, OS API limitations, and interference by signals that might interrupt the write part-way and require the remainder to be written in a separate system call)
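The gap between the two steps can be illustrated with a small sketch. The helper name simulate_interrupted_write is hypothetical; it stands in for a process that dies right after the open and never reaches the write.

```python
import tempfile
from pathlib import Path


def simulate_interrupted_write(path: Path) -> str:
    """Mimic a crash between step 1 (open/truncate) and step 2 (write)."""
    f = open(path, "w")  # step 1: opening in write mode truncates the file
    f.close()            # simulated crash: step 2 (the write) never happens
    return path.read_text()


with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "demo.txt"
    target.write_text("original data")
    leftover = simulate_interrupted_write(target)
    print(repr(leftover))  # prints '' because the original data is gone
```

The file still exists afterwards, but it holds neither the old content nor the new content.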

If nothing else, your code could die after step 1 and before step 2 (a badly timed Ctrl-C or power loss), and the original data would be gone, and no new data would be written.
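For reference, the temp-file-and-atomic-rename approach described above (which write_text does not use) could look roughly like the following. This is a minimal sketch, not part of pathlib: atomic_write_text is a hypothetical helper name, and it assumes the temp file and the target are on the same filesystem so that os.replace can swap them in one step.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path, data, encoding="utf-8"):
    """Hypothetical helper: readers see the old content or the new, never a mix."""
    path = Path(path)
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=str(path.parent), prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding=encoding) as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to push the data to disk first
        os.replace(tmp_name, path)  # swap the new file into place in one step
    except BaseException:
        # On any failure, remove the temp file and leave the original untouched.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        target = Path(tmp_dir) / "config.txt"
        target.write_text("original data")
        atomic_write_text(target, "new data")
        print(target.read_text())  # prints: new data
```

Note that mkstemp creates the file with restrictive permissions, so the replaced file may end up with a different mode than the original. The rename is atomic on POSIX; behaviour on other platforms and network filesystems may differ.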

Old answer in terms of general atomicity:

The question is kinda nonsensical on its face. It doesn't really matter if it's atomic; even if it were atomic, a nanosecond after the write occurs, some other process could open the file, truncate it, rewrite it, move it, etc. Heck, in between write_text opening the file and when it writes the data, some other process could swoop in and move/rename the newly opened file or delete it; the open handle write_text holds would still work when it writes a nanosecond later, but the data would never be seen in a file at the provided path (and might disappear the instant write_text closes it, if some other process swooped in and deleted it).
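The delete-while-open case can be reproduced inside a single process. The sketch below assumes a POSIX system (on Windows, deleting a file that is still open typically fails instead), and write_to_deleted_file is a hypothetical name; the os.unlink call stands in for the other process.

```python
import os
import tempfile
from pathlib import Path


def write_to_deleted_file(path: Path, data: str) -> bool:
    """POSIX-only sketch: write through a handle after the path was unlinked."""
    with open(path, "w") as f:
        os.unlink(path)  # stands in for another process deleting the file
        f.write(data)    # still succeeds: the open handle remains valid
    # Once the handle is closed, nothing exists at the original path.
    return path.exists()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        target = Path(tmp_dir) / "victim.txt"
        print(write_to_deleted_file(target, "new data"))  # False on POSIX
```

The write raises no error, yet the data is never visible at the provided path.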

Beyond that, it can't be atomic even while writing, in any portable sense. Two processes could have the same file open at once, and their writes can interleave (there are locks around the standard handles within a process to prevent this, but no such locks exist to coordinate with an arbitrary other process). Concurrent file I/O is hard; avoid it if at all possible.
