
Write values into csv during script execution

I have a simple script that reads values from one csv, runs some internal function on them that takes 2-3 seconds each time, and then writes the results into another csv file.

Here is what it looks like, minus the internal function I referenced.

import csv
import time

pause = 3

with open('input.csv', mode='r') as input_file, \
     open('output.csv', mode='w') as output_file:
    input_reader = csv.DictReader(input_file)
    output_writer = csv.writer(output_file, delimiter=',', quotechar='"',
                               quoting=csv.QUOTE_MINIMAL)
    count = 1
    for row in input_reader:
        row['new_value'] = "result from function that takes time"
        output_writer.writerow( row.values() )
        print( 'Processed row: ' + str( count ) )
        count = count + 1
        time.sleep(pause)

The problem I face is that the output.csv file remains blank until everything has finished executing.

I'd like to access and make use of the file elsewhere whilst this long script runs.

Is there a way to have each value written to output.csv as soon as it is produced, rather than only at the end?

Edit: here is a dummy csv file for the script above:

value
43t34t34t
4r245r243
2q352q352
gergmergre
435q345q35

I think you want to look at the buffering option - this is what controls how often Python flushes to a file.

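Specifically, open('name', 'wb', buffering=0) reduces buffering to the minimum, but unbuffered mode is only allowed for binary files. In Python 3 the csv module writes to a text-mode file, so there you would pass buffering=1 (line buffering) instead, or pick some other value that makes sense for you. The documentation for open() describes the parameter like this: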
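As a rough sketch of the line-buffered variant, assuming Python 3 and a text-mode output file as in the question: the helper name write_rows_line_buffered, the temporary path and the sample rows are made up for illustration. It records the file size on disk after every row to show that the data arrives row by row.

```python
import csv
import os
import tempfile


def write_rows_line_buffered(path, rows):
    """Write rows to path and return the on-disk size seen after each row."""
    sizes = []
    # buffering=1 selects line buffering (text mode only);
    # newline='' is what the csv module recommends for its file objects.
    with open(path, mode='w', newline='', buffering=1) as output_file:
        writer = csv.writer(output_file)
        for row in rows:
            writer.writerow(row)  # the row ends in a line break, which triggers a flush
            sizes.append(os.path.getsize(path))
    return sizes


# Hypothetical output location and rows, only for the demonstration.
out_path = os.path.join(tempfile.mkdtemp(), 'output.csv')
sizes = write_rows_line_buffered(
    out_path, [['43t34t34t', 'result a'], ['4r245r243', 'result b']]
)
print(sizes)
```

The sizes should grow with every row, which means another process reading output.csv sees each result shortly after it is written.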

buffering is an optional integer used to set the buffering policy. Pass 0 to switch buffering off (only allowed in binary mode), 1 to select line buffering (only usable in text mode), and an integer > 1 to indicate the size in bytes of a fixed-size chunk buffer. When no buffering argument is given, the default buffering policy works as follows:

  • Binary files are buffered in fixed-size chunks; the size of the buffer is chosen using a heuristic trying to determine the underlying device's “block size” and falling back on io.DEFAULT_BUFFER_SIZE. On many systems, the buffer will typically be 4096 or 8192 bytes long.
  • “Interactive” text files (files for which isatty() returns True) use line buffering. Other text files use the policy described above for binary files.
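The default policy explains the empty output.csv: a few short rows never fill the chunk buffer, so nothing reaches the disk until the file is closed. A minimal sketch of that effect, and of an explicit flush() as an alternative to changing the buffering argument, might look like this. The helper name sizes_before_and_after_flush and the temporary path are assumptions for the demonstration.

```python
import csv
import io
import os
import tempfile


def sizes_before_and_after_flush(path, row):
    """Write one row with default buffering; return the file size before and after flush()."""
    with open(path, mode='w', newline='') as output_file:  # default buffering policy
        csv.writer(output_file).writerow(row)
        before = os.path.getsize(path)  # the row is still held in Python's buffer
        output_file.flush()             # push buffered data out to the operating system
        after = os.path.getsize(path)
    return before, after


# Hypothetical output location, only for the demonstration.
demo_path = os.path.join(tempfile.mkdtemp(), 'output.csv')
before, after = sizes_before_and_after_flush(demo_path, ['43t34t34t', 'result'])
print('fallback buffer size:', io.DEFAULT_BUFFER_SIZE)
print('size before flush:', before, 'size after flush:', after)
```

In the script from the question, the equivalent change would be calling output_file.flush() right after output_writer.writerow(...) inside the loop.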

See also: How often does python flush to a file?
