os.link() vs. os.rename() vs. os.replace() for atomic file writes: what is the best approach?

Hi, I am trying to write an atomic write function like so:

with tempfile.NamedTemporaryFile(mode="w", dir=target_directory) as f:
    # perform file writing operation
    os.replace(f.name, target_file_name)

I am struggling to figure out what the best action would be in line 3. Should I use os.replace() or os.rename(), or should I create a hard link between the temp file and the target file using os.link()?

Does os.link() use more memory? What are the benefits of each, and are all of them atomic?

First, os.link creates a hard link, which means that the src and dst names share the same data on disk. No copy or write operations are performed, so there is no memory (or disk space) overhead. Sadly, hard links are not supported by all file systems. For instance, NTFS supports hard links, while FAT and (at least older versions of) ReFS do not.
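
A minimal sketch of what a hard link looks like from Python, assuming a filesystem that supports hard links. The helper name `hard_link_demo` and the file names are hypothetical. It shows that both names refer to the same file and that os.link, unlike os.replace, refuses to overwrite an existing destination.

```python
import os
import tempfile


def hard_link_demo():
    """Create a file, hard-link it, and report what the two names share."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        src = os.path.join(tmp_dir, "src.txt")
        dst = os.path.join(tmp_dir, "dst.txt")
        with open(src, "w", encoding="utf8") as f:
            f.write("payload")

        os.link(src, dst)  # adds a second name; no data is copied
        same_file = os.path.samefile(src, dst)
        link_count = os.stat(src).st_nlink

        # os.link never overwrites: linking onto an existing name fails.
        try:
            os.link(src, dst)
            refused_overwrite = False
        except FileExistsError:
            refused_overwrite = True

        return same_file, link_count, refused_overwrite
```

Because of that refusal, os.link fits "create only if absent" semantics, while os.replace fits "overwrite whatever is there" semantics.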

Secondly, os.replace should be preferred to os.rename because it behaves the same across platforms (it is available in Python 3.3+): on Windows, os.rename fails if the destination already exists, whereas os.replace overwrites it. Another important point is that the source and destination paths must be on the same filesystem. As for atomicity, there seems to be a very high probability (but not 100%) that os.replace is atomic in all possible cases on Unix and Windows (related links 1, 2, 3). In any case, this is the recommended approach for avoiding race conditions and TOCTOU bugs. I have never encountered, or been able to reproduce, a situation where a call to os.replace ended in corruption of the src or dst data. However, as long as such behavior is not required by the official documentation, os.replace should not be considered an atomic call (especially on Windows).
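
The usual write-then-replace pattern can be sketched as follows. This is only a sketch under stated assumptions: `atomic_write_text` is a hypothetical helper name, the temp file is created in the target's own directory so that both paths are on the same filesystem, and `delete=False` is used so the temp file survives being closed and can be renamed afterwards.

```python
import os
import tempfile


def atomic_write_text(target_path, text, encoding="utf8"):
    """Write text to a temp file beside target_path, then swap it into place."""
    target_dir = os.path.dirname(os.path.abspath(target_path))
    # delete=False: the file must survive close() so it can be renamed.
    tmp = tempfile.NamedTemporaryFile(
        mode="w", encoding=encoding, dir=target_dir, delete=False
    )
    try:
        with tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before the rename
        os.replace(tmp.name, target_path)  # overwrites an existing target
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        try:
            os.remove(tmp.name)
        except FileNotFoundError:
            pass
        raise
```

Readers of the target path see either the old contents or the new ones, to the extent that os.replace is atomic on the platform in question. The sketch does not protect against several writers racing each other.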

Your example code is not atomic by definition: at any moment another process can break the integrity of your temporary file, and abruptly terminating the process on non-Windows systems can even leave the temp file in the specified directory forever. To solve these problems you may need synchronization primitives such as locks, and the logic of your code must account for even the most improbable interruptions or corruptions of your data.

Here is an example of a common case in which data should be read from an existing file or, if that file does not exist yet, created and written to it:

import os
import time
filename = 'data' # file with data
temp_filename = 'data.temp' # temp used to create 'data'

def get_data():
    while True:
        try:
            os.remove(temp_filename)
        except FileNotFoundError: # if no data.temp
            try: # check if data already exists:
                with open(filename, 'rt', encoding='utf8') as f:
                    return f.read() # return data here
            except FileNotFoundError:
                pass # create data
        except PermissionError: # if another process/thread is creating data.temp right now
            time.sleep(0.1) # wait for it
            continue
        else:
            pass # a stale data.temp was removed, so something went wrong earlier; create everything again

        # data creation:
        excl_access = 'xt' # raises error if file exists - used as a primitive lock
        try:
            with open(temp_filename, excl_access, encoding='utf8') as f:
                # process can be interrupted here 
                f.write('Hello ') # or here
                f.write('world!') # or here
        except FileExistsError: # another one is creating it now
            time.sleep(0.1) # wait for it
            continue
        except Exception: # something went wrong
            continue
        try:
            os.replace(temp_filename, filename) # atomicity is not guaranteed in 100% of cases
        except FileNotFoundError:
            continue # try again
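
The 'x' open mode used above as a primitive lock can be seen in isolation in the sketch below. `try_acquire` is a hypothetical helper name and the lock path is arbitrary. Writing the process ID into the file is an illustrative choice, not part of the example above.

```python
import os
import tempfile


def try_acquire(lock_path):
    """Return True if this call created lock_path, False if it already existed."""
    try:
        # 'x' fails with FileExistsError instead of truncating an existing file.
        with open(lock_path, "x", encoding="utf8") as f:
            f.write(str(os.getpid()))
        return True
    except FileExistsError:
        return False
```

Only one caller can create the file, so whoever gets True holds the lock until the file is removed or renamed. A crashed holder leaves the file behind, which is why the example above starts by trying to remove a stale data.temp.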

Here's a related question with some answers that recommend external libraries for handling atomic file creation.
