
Is my file held in RAM while it is open in append mode?

I have written code that keeps appending to a file. Here is the code:

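# functionforprocess is my own per-line processing function (not shown here).
with open('able.csv', 'a', encoding='utf-8', errors='ignore') as writel, \
        open('test', 'r', encoding='utf-8', errors='ignore') as file:
    # Iterate over the file lazily; readlines() would load every line into RAM first.
    for count, line in enumerate(file, start=1):
        data = functionforprocess(line)
        if data != "":  # compare by value; "is not" tests object identity
            writel.write(data)
        if count % 10000 == 0:
            # Record progress every 10000 lines.
            with open('log', 'w') as log:
                log.write(str(count))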

My question is: is the file that I have opened in append mode held in RAM? Does the file act like a buffer? In other words, is storing the data in a variable and then writing the variable to the file equivalent to opening the file in append mode and writing to it directly?

Please help me clear up this confusion.

Appending is a basic function of file I/O and is carried out by the operating system. For instance, fopen with mode a or a+ is part of the POSIX standard. With file I/O, the OS also tends to buffer reads and writes; for most purposes, the data you have passed to write does not need to be on the disk at every moment. Sometimes it sits in a buffer somewhere in the OS; sometimes the OS dumps those buffers out to disk. You can force writes using fsync if that matters to you. This is also a really good reason to always call close on your open file objects when you are done with them (or to use a context manager); if you forget, you might get weird behaviour because of data still sitting in those buffers.
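As a minimal sketch of forcing a write: the helper name append_durably and the temporary file path are made up for illustration. In Python, the file object also keeps its own buffer, so the sketch calls flush() before os.fsync().

```python
import os
import tempfile


def append_durably(path, text):
    """Append text, then ask the OS to push it to the storage device."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(text)
        f.flush()             # hand Python's own buffer over to the OS
        os.fsync(f.fileno())  # ask the OS to write its buffers to disk


# Usage on a throwaway file (hypothetical name, created in a temp directory).
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "able.csv")
append_durably(demo_path, "a,b\n")
append_durably(demo_path, "c,d\n")

with open(demo_path, encoding="utf-8") as f:
    contents = f.read()
print(repr(contents))
```

Calling fsync on every write is slow, so it is usually reserved for data that must survive a crash.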

So, to answer your question: the file that you opened is most likely in RAM at any given moment. However, as far as I know, it's not available to you. You can interact with the data in the file using file I/O methods, but there is no buffer whose memory address you can get in order to read back what you just wrote. As to whether append-mode writing is equivalent to storing something in a buffer and then writing it to disk, I guess I would say no. Any kind of file I/O write will probably be buffered the same way by the OS, and the reason this is efficient is that the OS gets to decide when to flush the buffers. If you store things in a variable and then write them out to disk in one go, you get to decide when the writes take place.
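To see buffering from the Python side, here is a small sketch that checks the on-disk size before and after flush(). The file name is hypothetical, and the "before" value depends on the interpreter's buffering, so treat the comments as typical CPython behaviour rather than a guarantee.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "able.csv")

f = open(path, "a", encoding="utf-8")
f.write("hello")
# The five characters usually still sit in the file object's buffer here,
# so the size reported by the OS is typically 0.
size_before_flush = os.path.getsize(path)

f.flush()
# After flush() the OS has the data, so the reported size includes it.
size_after_flush = os.path.getsize(path)
f.close()

print(size_before_flush, size_after_flush)
```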

The signature of the open function is:

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

If you open in "a" (append) mode, it means: open for writing, appending to the end of the file if it exists. The mode says nothing about buffering.
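A short sketch of what the mode does and does not control, using a hypothetical temporary file: "a" keeps the existing content and adds to the end, while "w" truncates first.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "log.txt")

# Two separate opens in append mode: the second write lands after the first.
for _ in range(2):
    with open(path, "a", encoding="utf-8") as f:
        f.write("x\n")
with open(path, encoding="utf-8") as f:
    appended = f.read()

# Opening in write mode discards what was there before.
with open(path, "w", encoding="utf-8") as f:
    f.write("y\n")
with open(path, encoding="utf-8") as f:
    truncated = f.read()

print(repr(appended), repr(truncated))
```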

Buffering can be customized with the buffering parameter. Quoting the doc:

buffering is an optional integer used to set the buffering policy. Pass 0 to switch buffering off (only allowed in binary mode), 1 to select line buffering (only usable in text mode), and an integer > 1 to indicate the size in bytes of a fixed-size chunk buffer. When no buffering argument is given, the default buffering policy works as follows:

  • Binary files are buffered in fixed-size chunks; the size of the buffer is chosen using a heuristic trying to determine the underlying device's “block size” and falling back on io.DEFAULT_BUFFER_SIZE. On many systems, the buffer will typically be 4096 or 8192 bytes long.
  • “Interactive” text files (files for which isatty() returns True) use line buffering. Other text files use the policy described above for binary files.
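The policy quoted above can be checked with a small sketch. The file name is hypothetical; line_buffering is an attribute of the text file object that open() returns, and io.FileIO is the raw, unbuffered type returned for binary mode with buffering=0.

```python
import io
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "demo.txt")

# Default policy: a regular (non-interactive) text file is chunk buffered.
with open(path, "a", encoding="utf-8") as f:
    default_is_line_buffered = f.line_buffering

# buffering=1 selects line buffering in text mode.
with open(path, "a", encoding="utf-8", buffering=1) as f:
    explicit_line_buffered = f.line_buffering

# buffering=0 is only allowed in binary mode and returns a raw file object.
with open(path, "ab", buffering=0) as f:
    unbuffered_type = type(f)

# In text mode, buffering=0 is rejected with ValueError.
try:
    open(path, "a", encoding="utf-8", buffering=0)
    text_unbuffered_allowed = True
except ValueError:
    text_unbuffered_allowed = False

print(io.DEFAULT_BUFFER_SIZE, default_is_line_buffered, explicit_line_buffered)
```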

In your example, the file is opened for appending in text mode.

So, only a chunk of your data is held in the file's buffer in RAM during writing. If you write a "big" piece of data, it will be divided into several chunks.
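A rough way to watch this happen is to pick a deliberately tiny buffer and look at the file size between writes. This is only a sketch: the 16-byte buffer and the file name are arbitrary choices for illustration, and the intermediate sizes depend on the interpreter, so only the final size is certain.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "chunks.bin")

f = open(path, "ab", buffering=16)  # tiny buffer, chosen only for the demo

f.write(b"a" * 10)
# 10 bytes fit in the 16-byte buffer, so usually nothing is on disk yet.
size_after_small = os.path.getsize(path)

f.write(b"b" * 100)
# 100 bytes do not fit, so the buffer has to be emptied to the OS along the way.
size_after_big = os.path.getsize(path)

f.close()  # close() flushes whatever is still buffered
final_size = os.path.getsize(path)

print(size_after_small, size_after_big, final_size)
```

The data you pass to write is of course already in RAM as a Python object; the chunking applies to the file object's own buffer.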
