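Python function (or a code block) runs much slower with a time interval in a loop
I noticed a case in Python where a block of code nested in a loop runs much faster when it runs continuously than when it runs with a .sleep() time interval between iterations.
I don't know the reason or a possible solution.
I guess it has something to do with the CPU cache or some mechanism of the CPython VM.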
'''
Created on Aug 22, 2015
@author: doge
'''
import numpy as np
import time
import gc

gc.disable()
t = np.arange(100000)
for i in xrange(100):
    #np.sum(t)
    time.sleep(1) #--> if you comment this line, the following lines will be much faster
    st = time.time()
    np.sum(t)
    print (time.time() - st)*1e6
Result:
without sleep in loop, time consumed: 50us
with a sleep in loop, time consumed: >150us
One drawback of .sleep() is that it releases the CPU, so below I provide a version that works exactly the same way as the C code:
'''
Created on Aug 22, 2015
@author: doge
'''
import numpy as np
import time
import gc

gc.disable()
t = np.arange(100000)
count = 0
# note: as written, xrange(100) never lets count reach 1000000,
# so the loop range must be large enough for the timing code to run
for i in xrange(100):
    count += 1
    if ( count % 1000000 != 0 ):
        continue
    #--> these three lines make the following lines much slower
    st = time.time()
    np.sum(t)
    print (time.time() - st)*1e6
Another experiment (we removed the for loop):
st = time.time()
np.sum(t)
print (time.time() - st)*1e6
st = time.time()
np.sum(t)
print (time.time() - st)*1e6
st = time.time()
np.sum(t)
print (time.time() - st)*1e6
...
st = time.time()
np.sum(t)
print (time.time() - st)*1e6
Result:
The execution time gradually decreased from 150us to 50us, then stayed stable at 50us.
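The gradual warm-up described above can be easier to see if the timings are collected in a list and printed only afterwards, so that console output does not run between measurements. The following is a minimal sketch, not the original poster's code: it assumes Python 3, uses the built-in `sum` over a list as a stand-in for `np.sum(t)`, and the helper name `time_calls` is hypothetical.

```python
import time

def time_calls(func, n_calls):
    """Call func() n_calls times back to back; return each duration in microseconds."""
    durations_us = []
    for _ in range(n_calls):
        start = time.perf_counter()   # monotonic, high-resolution clock
        func()
        durations_us.append((time.perf_counter() - start) * 1e6)
    return durations_us

data = list(range(100000))            # stand-in for np.arange(100000)
timings = time_calls(lambda: sum(data), 20)

# Print after all measurements are done, so I/O does not sit between calls.
for i, us in enumerate(timings):
    print("call %2d: %8.1f us" % (i, us))
```

Whether the first few calls are noticeably slower than the later ones will depend on the machine, so no expected output is shown here.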
To find out whether this is a CPU cache issue, I wrote a C counterpart and found that this phenomenon does not occur.
#include <iostream>
#include <sys/time.h>
#define num 100000
using namespace std;

long gus()
{
    struct timeval tm;
    gettimeofday(&tm, NULL);
    return ( (tm.tv_sec % 86400 + 28800) % 86400 )*1000000 + tm.tv_usec;
}

double vec_sum(double *v, int n){
    double result = 0;
    for(int i = 0;i < n;++i){
        result += v[i];
    }
    return result;
}

int main(){
    double a[num];
    for(int i = 0; i < num; ++i){
        a[i] = (double)i;
    }
    //for(int i = 0; i < 1000; ++i){
    //    cout<<a[i]<<"\n";
    //}
    int count = 0;
    long st;
    while(1){
        ++count;
        if(count%100000000 != 0){ //---> i use this line to create a delay, we can do the same way in python, result is the same
        //if(count%1 != 0){
            continue;
        }
        st = gus();
        vec_sum(a,num);
        cout<<gus() - st<<endl;
    }
    return 0;
}
Result:
The time is stable at 250us, whether using "count%100000000" or "count%1".
(Not an answer, just too long for a comment.)
I did some experimenting and ran (something a bit simpler) through timeit.
from timeit import timeit
import time

n_loop = 15
n_timeit = 10
sleep_sec = 0.1
t = range(100000)

def with_sleep():
    for i in range(n_loop):
        s = sum(t)
        time.sleep(sleep_sec)

def without_sleep():
    for i in range(n_loop):
        s = sum(t)

def sleep_only():
    for i in range(n_loop):
        time.sleep(sleep_sec)

wo = timeit(setup='from __main__ import without_sleep',
            stmt='without_sleep()',
            number = n_timeit)
w = timeit(setup='from __main__ import with_sleep',
           stmt='with_sleep()',
           number = n_timeit)
so = timeit(setup='from __main__ import sleep_only',
            stmt='sleep_only()',
            number = n_timeit)

print(so - n_timeit*n_loop*sleep_sec, so)
print(w - n_timeit*n_loop*sleep_sec, w)
print(wo)
The results are:
0.031275457000447204 15.031275457000447
1.0220358229998965 16.022035822999896
0.41462676399987686
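The first line just checks that the sleep function uses about n_timeit*n_loop*sleep_sec seconds. So if this value is small, that is fine.
But as you can see, your finding still holds: the loop with sleep (minus the time spent sleeping) takes more time than the loop without sleep...
I don't think Python optimizes the loop without sleep (a C compiler might, since the variable s is never used).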
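Instead of subtracting the sleep time from a total, the summation itself can be timed inside each iteration, and a busy-wait can be added as a third case that lets time pass without giving up the CPU. This is only a sketch under assumptions: Python 3, the built-in `sum` over a list in place of NumPy, and hypothetical names (`busy_wait`, `measure`, `pause_sec`) that do not appear in the posts above.

```python
import time

def busy_wait(seconds):
    """Spin until `seconds` have elapsed, without yielding the CPU."""
    end = time.perf_counter() + seconds
    while time.perf_counter() < end:
        pass

def measure(work, pause, n_loop):
    """Run pause() then work() n_loop times; return only work()'s durations in us."""
    durations_us = []
    for _ in range(n_loop):
        pause()                          # not included in the measurement
        start = time.perf_counter()
        work()
        durations_us.append((time.perf_counter() - start) * 1e6)
    return durations_us

data = list(range(100000))
work = lambda: sum(data)
pause_sec = 0.01                         # hypothetical pause length

results = {
    "no pause": measure(work, lambda: None, 15),
    "sleep": measure(work, lambda: time.sleep(pause_sec), 15),
    "busy wait": measure(work, lambda: busy_wait(pause_sec), 15),
}
for label, durations in results.items():
    median_us = sorted(durations)[len(durations) // 2]
    print("%-10s median %.1f us" % (label, median_us))
```

If the "sleep" case comes out slower while "busy wait" stays close to "no pause", that would suggest the slowdown is tied to the process releasing the CPU rather than to elapsed time alone. That reading is an assumption and was not tested in this thread.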