简体   繁体   English

如何用python编写和更新.txt文件?

[英]How to write and update .txt files with python?

I ve written a script that fetches bitcoin data and saves it in .txt files or in the case where the .txt files exist, it updates them.我编写了一个脚本来获取比特币数据并将其保存在 .txt 文件中,或者在 .txt 文件存在的情况下,它会更新它们。 The .txt files are nodes and relationships connecting the nodes for neo4j. .txt 文件是连接 neo4j 节点的节点和关系。

At the beginning of the script:在脚本的开头:

  1. It checks whether the files exist, so it opens them and appends new lines OR它检查文件是否存在,因此它打开它们并附加新行或
  2. In case the files do not exist, the script creates them and starts appending lines.如果文件不存在,脚本会创建它们并开始附加行。

The .txt files are constantly open, the script writes the new data. .txt 文件不断打开,脚本写入新数据。 The .txt files close when all the data are written or I terminate the execution. .txt 文件在写入所有数据或我终止执行时关闭。

My question is:我的问题是:

Should I open, write, close each .txt file for each iteration and for each .txt file?我应该为每次迭代和每个 .txt 文件打开、写入、关闭每个 .txt 文件吗?

or或者

Should I keep it the way it is now;我应该保持现在的样子吗? open the .txt files, do all the writing, when the writing is done close the .txt file打开 .txt 文件,进行所有写入,写入完成后关闭 .txt 文件

I am saving data from 6013 blocks.我正在保存 6013 个块的数据。 Which way would minimize risk of corrupting the data written in the .txt files?哪种方式可以最大限度地降低损坏 .txt 文件中写入的数据的风​​险?

Keeping files open will be faster.保持文件打开会更快。 In the comments you mentioned that "Loss of data previously written is not an option".在您提到的评论中,“以前写入的数据丢失不是一种选择”。 The probability of corrupting files is higher for open files so open and close file on each iteration is more reliable.打开文件损坏文件的可能性更高,因此每次迭代打开和关闭文件更可靠。 There is also an option to keep data in some buffer and to write/append buffer to file when all data is received or on user/system interrupt or network timeout.还有一个选项可以将数据保留在某个缓冲区中,并在接收到所有数据时或在用户/系统中断或网络超时时将缓冲区写入/附加到文件。

I think keeping the file open will be more efficient, because python won't need to search for the file and open it every time you want to read/write the file.我认为保持文件打开会更有效,因为每次要读/写文件时,python 都不需要搜索文件并打开它。

I guess it should look like this我想它应该是这样的

with open(filename, "a") as file:
    while True:
        data = # get data
        file.write(data)

Run a benchmark and see for yourself would the typical answer for this kind of question.运行一个基准测试并亲自看看此类问题的典型答案。

Nevertheless opening and closing a file does have a cost.然而,打开和关闭文件确实有成本。 Python needs to allocate memory for the buffer and data structures associated with the file and call some operating system functions, eg the open syscall which in turn would search the file in cache or on disk. Python 需要为与文件关联的缓冲区和数据结构分配内存,并调用一些操作系统函数,例如open系统调用,它反过来会在缓存或磁盘上搜索文件。

On the other hand there is a limit on the number of files a program, the user, the whole system, etc can open at the same time.另一方面,程序、用户、整个系统等可以同时打开的文件数量是有限制的。 For example on Linux, the value in /proc/sys/fs/file-max denotes the maximum number of file-handles that the kernel will allocate.例如在 Linux 上, /proc/sys/fs/file-max表示内核将分配的最大文件句柄数。 When you get lots of error messages about running out of file handles, you might want to increase this limit ( source ).当您收到大量有关耗尽文件句柄的错误消息时,您可能希望增加此限制 ( source )。 If your program runs in such a restrictive environment then it would be good to keep the file open only when needed.如果您的程序在如此受限的环境中运行,那么最好仅在需要时保持文件打开。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM