
How to write and update .txt files with Python?

I've written a script that fetches bitcoin data and saves it in .txt files or, if the .txt files already exist, updates them. The .txt files hold the nodes, and the relationships connecting the nodes, for neo4j.

At the beginning of the script:

  1. If the files exist, it opens them and appends new lines, OR
  2. If the files do not exist, it creates them and starts appending lines.

The .txt files stay open the whole time while the script writes the new data. They are closed when all the data has been written or when I terminate the execution.

My question is:

Should I open, write to, and close each .txt file on every iteration?

or

Should I keep it the way it is now: open the .txt files, do all the writing, and close them only when the writing is done?

I am saving data from 6013 blocks. Which way would minimize the risk of corrupting the data written to the .txt files?

Keeping the files open will be faster. However, in the comments you mentioned that "Loss of data previously written is not an option". A file that stays open for the whole run is more likely to end up corrupted if the script is interrupted, so opening and closing the file on each iteration is more reliable. Another option is to keep the data in a buffer and write (append) the buffer to the file once all the data has been received, or on a user/system interrupt or network timeout.
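The open-write-close approach could be sketched roughly as follows. The helper name `append_block` and the idea of passing one block's lines as a list are assumptions made for illustration, not part of the original script. Calling `flush()` and then `os.fsync()` before the file closes asks the operating system to commit the data to disk, so a later crash is less likely to lose lines that were already written.

```python
import os

def append_block(path, lines):
    """Append one block's lines (hypothetical helper), then close the file.

    Mode "a" creates the file if it does not exist and appends otherwise,
    so the same call covers both the "create" and the "update" case.
    """
    with open(path, "a", encoding="utf-8") as f:
        f.writelines(line + "\n" for line in lines)
        f.flush()              # empty Python's own write buffer
        os.fsync(f.fileno())   # ask the OS to commit the data to disk
```

Calling `append_block("nodes.txt", ["line 1", "line 2"])` once per block keeps each file open only for the duration of a single write.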

I think keeping the file open will be more efficient, because Python won't need to look up the file and open it every time you want to read from or write to it.

I guess it should look like this:

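with open(filename, "a") as file:
    while True:
        data = get_data()  # placeholder: fetch the next chunk of data
        if data is None:   # stop once there is nothing left to write
            break
        file.write(data)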
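If the file is kept open, one way to limit what an interruption can cost is to flush after every block. The sketch below is only an illustration under assumptions: `write_blocks` and the `blocks` iterable are hypothetical names, and each block is assumed to be a single string.

```python
import os

def write_blocks(path, blocks):
    """Keep one handle open for the whole run, flushing after each block."""
    written = 0
    with open(path, "a", encoding="utf-8") as f:
        for block in blocks:
            f.write(block + "\n")
            f.flush()              # push Python's buffer to the OS
            os.fsync(f.fileno())   # ask the OS to commit it to disk
            written += 1
    # Leaving the with-block closes the file, also on Ctrl+C or an error.
    return written
```

With this shape, an interrupt should affect at most the block being written at that moment, at the price of one `fsync` call per block.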

"Run a benchmark and see for yourself" would be the typical answer to this kind of question.
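Such a benchmark might look like the sketch below. The function names and the synthetic `block N` lines are made up for illustration. The count of 6013 comes from the question. Timings depend heavily on the disk and the operating system, so no expected numbers are given.

```python
import os
import tempfile
import time

def keep_open(path, lines):
    """Open once, write every line, close at the end."""
    with open(path, "a", encoding="utf-8") as f:
        for line in lines:
            f.write(line)

def reopen_each_time(path, lines):
    """Open, write and close the file once per line."""
    for line in lines:
        with open(path, "a", encoding="utf-8") as f:
            f.write(line)

def time_strategy(strategy, lines):
    """Return (seconds, file size in bytes) for one run on a fresh temp file."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        start = time.perf_counter()
        strategy(path, lines)
        elapsed = time.perf_counter() - start
        return elapsed, os.path.getsize(path)
    finally:
        os.remove(path)

if __name__ == "__main__":
    lines = [f"block {i}\n" for i in range(6013)]
    for strategy in (keep_open, reopen_each_time):
        seconds, size = time_strategy(strategy, lines)
        print(f"{strategy.__name__}: {seconds:.4f} s, {size} bytes")
```

Both strategies should produce files of the same size, so any difference in the printed times reflects the cost of the repeated open and close calls.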

Nevertheless, opening and closing a file does have a cost. Python needs to allocate memory for the buffer and the data structures associated with the file, and it has to call some operating system functions, e.g. the open syscall, which in turn searches for the file in the cache or on disk.

On the other hand, there is a limit on the number of files that a program, a user, or the whole system can have open at the same time. For example, on Linux the value in /proc/sys/fs/file-max denotes the maximum number of file handles that the kernel will allocate. When you get lots of error messages about running out of file handles, you might want to increase this limit (source). If your program runs in such a restrictive environment, it would be better to keep each file open only while it is needed.
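To see which limits apply, something like the following sketch could be used on Linux or another Unix-like system. The `resource` module is not available on Windows, and `open_file_limits` is a name chosen here for illustration. Note that /proc/sys/fs/file-max is the system-wide figure, while `RLIMIT_NOFILE` is the per-process limit that a single script usually hits first.

```python
import resource
from pathlib import Path

def open_file_limits():
    """Return the per-process and, where readable, system-wide file limits."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    file_max = Path("/proc/sys/fs/file-max")  # Linux only
    system_wide = int(file_max.read_text()) if file_max.exists() else None
    return {"soft": soft, "hard": hard, "system_wide": system_wide}

if __name__ == "__main__":
    print(open_file_limits())
```

A script that keeps only a handful of .txt files open stays far below typical default limits, so this mainly matters when many files are open at once.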
