How to achieve maximum write speed with Python?
I'm writing a program that will do high-speed data acquisition. The acquisition card can run at up to 6.8 GB/s (it's on PCIe3 x8). Right now I'm trying to stream to a RAM disk to see the maximum write speed I can achieve with Python.
The card is going to give me 5-10 MB blocks, which I can then write somewhere.
I wrote this piece of code, which writes a 10 MB block 500 times to a binary file. I'm using Anaconda2 on Windows 7 64-bit, and I used the profiler from Anaconda's accelerate.
import os
import time
from accelerate import profiler  # profiler shipped with Anaconda Accelerate

block = 'A'*10*1024*1024  # one 10 MB block
filename = "R:\\test"
f = os.open(filename, os.O_CREAT | os.O_BINARY | os.O_TRUNC | os.O_WRONLY | os.O_SEQUENTIAL)

p = profiler.Profile(signatures=False)
p.enable()
start = time.clock()
for x in range(500):
    os.write(f, block)
transferTime_sec = time.clock() - start
p.disable()
p.print_stats()

print('\nwrote %f MB' % (os.stat(filename).st_size/(1024*1024)))
I tested this on a RAM disk (R:\) and I got the following output:
So I figured I'm getting around 2.5 GB/s on RAM, which is not bad but still far from the maximum RAM throughput. The numbers are consistent, though. So the low throughput is one problem.
The second problem is that when I test this code with a PCIe SSD (which I had benchmarked with other software at 1090 MB/s sequential write), it gives comparable figures.
This makes me think that it's caching and/or buffering (?), so I'm just not measuring actual IO. I'm not sure what's really going on, as I'm fairly new to Python.
So my main question is how to achieve maximum write speeds, and a side question is why I am getting these numbers.
I don't know if you are still looking into this issue, but I found your question interesting, so I gave it a try on a Linux laptop.
I ran your code on Python 3.5 and found that you also need the os.O_SYNC flag to avoid the buffering issue (basically, os.write won't return until all the data has been written to disk). I also replaced time.clock() with time.time(), which gave me better results.
import os
import time
import cProfile

def ioTest():
    block = bytes('A'*10*1024*1024, 'utf-8')
    filename = 'test.bin'
    # O_SYNC makes each os.write block until the data has reached the disk
    f = os.open(filename, os.O_WRONLY | os.O_CREAT | os.O_TRUNC |
                os.O_SYNC)
    start = time.time()
    for x in range(500):
        os.write(f, block)
    os.close(f)
    transferTime_sec = time.time() - start
    msg = 'Wrote {:.0f}MB in {:0.03f}s'
    print(msg.format(os.stat(filename).st_size/1024/1024,
                     transferTime_sec))

cProfile.run('ioTest()')
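Since os.O_SYNC is not available on Windows (where the question was asked), here is a rough, portable sketch of the same idea: keep the writes buffered, but call os.fsync() before stopping the clock so the flush is included in the measurement. The names timed_write and throughput_mb_s are made up for this illustration, and time.perf_counter() is used because time.clock() was removed in Python 3.8. This is only one way to do it, not necessarily the fastest.

```python
import os
import time


def timed_write(path, block, count, sync=True):
    """Write `block` to `path` `count` times; return (bytes_written, seconds).

    Hypothetical helper for illustration only.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    # O_BINARY only exists on Windows; getattr falls back to 0 elsewhere.
    flags |= getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags)
    written = 0
    start = time.perf_counter()
    try:
        for _ in range(count):
            written += os.write(fd, block)
        if sync:
            # Ask the OS to flush its cache to the device before stopping the clock.
            os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return written, elapsed


def throughput_mb_s(nbytes, seconds):
    """Convert a byte count and a duration into MiB/s."""
    return nbytes / (1024 * 1024) / seconds


if __name__ == "__main__":
    block = b"A" * (10 * 1024 * 1024)
    for sync in (False, True):
        n, t = timed_write("test.bin", block, 10, sync=sync)
        print("sync=%s: %.0f MiB/s" % (sync, throughput_mb_s(n, t)))
```

If the figure with sync=False is much higher than with sync=True, the difference is most likely the OS cache absorbing the writes, which would explain why the SSD and the RAM disk gave comparable numbers.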
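Also, this post talks about using the os.O_DIRECT flag, which bypasses the kernel page cache so that data is transferred directly (via DMA) from your buffer to the device, avoiding that bottleneck. O_DIRECT requires an aligned buffer, so I had to use the mmap module (an anonymous mapping is page-aligned) to make it work on my machine: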
import os
import time
import cProfile
import mmap

def ioTest():
    # Anonymous mmap gives a page-aligned buffer, as O_DIRECT requires
    m = mmap.mmap(-1, 10*1024*1024)
    block = bytes('A'*10*1024*1024, 'utf-8')
    m.write(block)
    filename = 'test.bin'
    # O_DIRECT must be OR-ed into the flags; the third argument of os.open is the file mode
    f = os.open(filename, os.O_WRONLY | os.O_CREAT | os.O_TRUNC |
                os.O_SYNC | os.O_DIRECT)
    start = time.time()
    for x in range(500):
        os.write(f, m)
    os.close(f)
    transferTime_sec = time.time() - start
    msg = 'Wrote {:.0f}MB in {:0.03f}s.'
    print(msg.format(os.stat(filename).st_size/1024/1024,
                     transferTime_sec))

cProfile.run('ioTest()')
This reduced the write time on my machine by 40%... not bad. I didn't use os.O_SEQUENTIAL and os.O_BINARY, which are not available on my machine (they are Windows-only).
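Because these flags differ between platforms, and some filesystems (tmpfs, for example) may refuse O_DIRECT, a defensive variant might look like the sketch below. The helper names open_direct and aligned_block are invented for this example, and it assumes the block size is a multiple of the page size, which a 10 MB block is on common systems.

```python
import mmap
import os


def open_direct(path):
    """Open `path` for writing, using O_DIRECT where it is available.

    Hypothetical helper. Returns (fd, is_direct).
    """
    base = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    direct = getattr(os, "O_DIRECT", 0)  # 0 on platforms without O_DIRECT
    if direct:
        try:
            return os.open(path, base | direct), True
        except OSError:
            pass  # the filesystem rejected O_DIRECT; fall back to buffered I/O
    return os.open(path, base), False


def aligned_block(size, fill=b"A"):
    """Return a page-aligned buffer of `size` bytes filled with the single byte `fill`."""
    if size % mmap.PAGESIZE:
        raise ValueError("size must be a multiple of mmap.PAGESIZE")
    buf = mmap.mmap(-1, size)  # anonymous mappings start on a page boundary
    buf.write(fill * size)
    return buf


if __name__ == "__main__":
    block = aligned_block(10 * 1024 * 1024)
    fd, is_direct = open_direct("test.bin")
    try:
        for _ in range(10):
            os.write(fd, block)
    finally:
        os.close(fd)
    print("O_DIRECT in use:", is_direct)
```

The fallback keeps the script runnable everywhere, but timings taken with is_direct set to False go through the page cache again, so they should not be compared directly with the O_DIRECT numbers.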
[Edit]: I found out how to use the os.O_DIRECT flag from this site, which explains it very well and in depth. I strongly recommend reading it if you are interested in performance and direct IO in Python.