
Benchmarking with Python

I am looking to use Python as a system to benchmark other processes' time, data I/O, correctness, etc. What I am really interested in is the accuracy of the timing. For example:

import subprocess
from time import time

start = time()
subprocess.call(['md5sum'] + somelist)
end = time()
print("%f s" % (end - start))

Would the subprocess call add considerable overhead to the function being measured?

EDIT

Well, after a few quick tests it appears my best option is to use subprocess. However, I have noted that including stdout/stderr with the communicate() call adds about 0.002338 s of extra time to the execution.
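One way to estimate the subprocess overhead itself is to time a no-op command. This is a rough sketch, assuming a POSIX system where `true` exists; the loop count and command are illustrative, not from the post:

```python
import subprocess
import time

# Estimate per-call subprocess spawn overhead by timing a no-op command.
N = 20
start = time.perf_counter()
for _ in range(N):
    subprocess.call(['true'])
elapsed = time.perf_counter() - start
print("avg spawn overhead: %.6f s" % (elapsed / N))
```

Whatever average this prints is roughly the floor of what any subprocess-based measurement will include.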

Use timeit

import timeit
timeit.timeit('yourfunction()', globals=globals())
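As a concrete sketch of the timeit pattern, here `hashlib.md5` stands in for the work being timed (the names and sizes are illustrative, not from the post); `timeit.timeit` returns the total seconds for `number` executions:

```python
import timeit

# Setup runs once; the statement is executed `number` times.
setup = "import hashlib; data = b'x' * 1024"
stmt = "hashlib.md5(data).hexdigest()"

total = timeit.timeit(stmt, setup=setup, number=10000)
print("per-call: %.9f s" % (total / 10000))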

Update

If you are on Linux, then you can use the time command, like this:

import subprocess
result = subprocess.Popen(['time', 'md5sum'] + somelist, shell=False,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE).communicate()
print(result)
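Note that time writes its report to stderr, not stdout, so that is the stream to read. A minimal sketch, using bash's builtin time and `sleep 0` as a stand-in for the real command (both assumed available):

```python
import subprocess

# bash's builtin `time` prints real/user/sys on stderr.
proc = subprocess.run(['bash', '-c', 'time sleep 0'],
                      capture_output=True, text=True)
print(proc.stderr)  # real/user/sys lines
```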

If your timer script doesn't care about the stdout/stderr of the programs it times, just pipe the output to files or to /dev/null. This avoids the overhead of reading the streams and turning them into Python strings.

import subprocess
from time import time

start = time()
subprocess.call(['md5sum'] + some_list, stdout=open('/dev/null', 'w'),
    stderr=subprocess.STDOUT)
delta = time() - start

Avoid communicate() if you don't need to read stdout and stderr as separate streams. It spawns a second thread to read stderr, which adds to the overhead.
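On Python 3.3+, subprocess.DEVNULL discards output without opening /dev/null by hand. A sketch of the same pattern, with `true` standing in for the real command:

```python
import subprocess
import time

# Discard all output; no pipes, no reader threads, no Python strings.
start = time.perf_counter()
subprocess.call(['true'], stdout=subprocess.DEVNULL,
                stderr=subprocess.STDOUT)
delta = time.perf_counter() - start
print("%.6f s" % delta)
```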
