Is python automagically parallelizing IO- and CPU- or memory-bound sections?

This is a follow-up question to a previous one.

Consider this code, which is less toyish than the one in the previous question (but still much simpler than my real one):

import sys

data = []

# Keep the last character of each line (the newline, except possibly the last line)
for line in open(sys.argv[1]):
    data.append(line[-1])

print data[-1]

Now, I was expecting a longer run time (my benchmark file is 65150224 lines long), possibly much longer. This was not the case: it runs in ~2 minutes on the same hardware as before!

Is data.append() very lightweight? I don't believe so, so I wrote this fake code to test it:

data = []
counter = 0
string = "a\n"

# Same number of appends as before, but with no file IO at all
for counter in xrange(65150224):
    data.append(string[-1])

print data[-1]

This runs in 1.5 to 3 minutes (there is strong variability among runs).

Why don't I get 3.5 to 5 minutes in the former program? Obviously data.append() is happening in parallel with the IO.

This is good news!

But how does it work? Is it a documented feature? Is there any requirement my code should follow to make it work as much as possible (besides load-balancing IO and memory/CPU activity)? Or is it just plain buffering/caching in action?

Again, I tagged this question "linux" because I'm interested only in Linux-specific answers. Feel free to give OS-agnostic, or even other-OS, answers if you think it's worth doing.

Obviously data.append() is happening in parallel with the IO.

I'm afraid not. It is possible to parallelize IO and computation in Python, but it doesn't happen magically.
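To actually overlap reading with processing you have to arrange it yourself, for example with a reader thread feeding a bounded queue. This is a sketch only (the helper name process_file is illustrative, and it assumes CPython, where the GIL is released during the underlying read calls, shown in Python 3 syntax):

```python
import threading
from queue import Queue

def process_file(path):
    """Overlap file IO with per-line processing using a reader thread."""
    q = Queue(maxsize=1024)  # bounded, so the reader can't run far ahead
    SENTINEL = None

    def reader():
        # Runs in a separate thread; blocking reads release the GIL,
        # so the consumer below can work while the next line is fetched.
        with open(path) as f:
            for line in f:
                q.put(line)
        q.put(SENTINEL)

    t = threading.Thread(target=reader)
    t.start()

    data = []
    while True:
        line = q.get()
        if line is SENTINEL:
            break
        data.append(line[-1])
    t.join()
    return data
```

Whether this wins anything depends on how much work the consumer does per line; for something as cheap as append, the queue overhead can easily outweigh the overlap.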

One thing you could do is use posix_fadvise(2) to give the OS a hint that you plan to read the file sequentially (POSIX_FADV_SEQUENTIAL).

In some rough tests doing "wc -l" on a 600 meg file (an ISO), the performance increased by about 20%. Each test was done immediately after clearing the disk cache.

For a Python interface to fadvise, see python-fadvise.
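On Python 3.3+ the same hint is available directly in the standard library as os.posix_fadvise (Unix/Linux only), without a third-party wrapper. A minimal sketch, where the helper name read_sequential is illustrative:

```python
import os

def read_sequential(path):
    """Read a file front to back, advising the kernel of the access pattern."""
    with open(path, "rb") as f:
        # offset=0, length=0 means "the whole file"; with
        # POSIX_FADV_SEQUENTIAL the kernel may use a larger readahead window.
        os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_SEQUENTIAL)
        last = b""
        for line in f:
            last = line
        return last
```

As with any fadvise hint, the kernel is free to ignore it; measure on your own workload after dropping the disk cache.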

How big are the lines in your file? If they're not very long (anything under about 1K probably qualifies), then you're likely seeing performance gains because of input buffering.
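That buffering can be made visible by counting how often the raw, unbuffered layer is actually hit while iterating many short lines; a small sketch using the standard io classes (Python 3):

```python
import io
import os
import tempfile

class CountingRaw(io.FileIO):
    """FileIO that counts how many times the buffered layer reads from it."""
    reads = 0

    def readinto(self, b):
        CountingRaw.reads += 1
        return super().readinto(b)

# Write 1000 short lines to a temporary file.
with tempfile.NamedTemporaryFile("w", delete=False) as tf:
    tf.write("short line\n" * 1000)
    path = tf.name

# Read them back through a normal-sized buffer: the 1000 readline()s are
# served from memory, triggering only a handful of raw reads.
buf = io.BufferedReader(CountingRaw(path))
lines = buf.readlines()
buf.close()
os.unlink(path)

print(len(lines), "lines served by", CountingRaw.reads, "raw reads")
```

This is why per-line Python loops over a file are far cheaper than one read(2) per line would suggest.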

Why do you think list.append() would be a slower operation? It is extremely fast: the internal pointer array a list uses to hold references to its objects is allocated in increasingly large blocks, so most appends do not re-allocate the array at all and can simply increment the length counter, set a pointer, and incref the object.
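That over-allocation is easy to observe: sys.getsizeof reports the list's allocated size, which jumps only occasionally as items are appended. A small, CPython-specific illustration (Python 3):

```python
import sys

# Append 100 items and record the list's allocated size after each append.
sizes = []
lst = []
for _ in range(100):
    lst.append(None)
    sizes.append(sys.getsizeof(lst))

# The number of distinct sizes equals the number of reallocations,
# which is far smaller than the number of appends.
print(len(set(sizes)), "reallocations for", len(sizes), "appends")
```

The exact growth pattern is an implementation detail of CPython's listobject, but the amortized cost per append stays constant.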

I don't see any evidence that "data.append() is happening in parallel with the IO." Like Benji, I don't think this is automatic in the way you think. You showed that doing data.append(line[-1]) takes about the same amount of time as lc = lc + 1 (essentially no time at all, compared to the IO and line splitting). It's not really surprising that data.append(line[-1]) is very fast. One would expect the whole line to be in a fast cache, and as noted, append prepares buffers ahead of time and only rarely has to reallocate. Moreover, line[-1] will always be '\n', except possibly for the last line of the file (no idea if Python optimizes for this).

The only part I'm a little surprised about is that the xrange run is so variable. I would expect it to always be faster, since there's no IO and you're not actually using the counter.

If your run times vary that much for the second example, I'd suspect that your timing method or outside influences (other processes / system load) are skewing the times to the point that they don't give any reliable information.
