
JupyterLab %%timeit not consistent with Python timeit

I wanted to test whether type hinting had an influence on how long the code takes to run. It would probably add a tiny bit of time, because the compiler still has to handle the annotations, and that takes time, but I wanted to see how insignificant this was.

To do this, I executed this in JupyterLab: [screenshot of the %%timeit results]

However, one of my classmates tried this without JupyterLab and got this: [screenshot of their command-line timeit results]

Does someone have an explanation as to why this would be happening?

Functions used:

def func(a, b):
    return a + b
func(6, 7)    

and:

def func(a: int, b: int) -> int:
    return a + b
func(6, 7)
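
For reference, the same comparison can be reproduced outside any notebook with the stdlib timeit module, so that both environments measure exactly the same work. A minimal sketch (absolute numbers will vary by machine and Python version):

import timeit

# Each statement re-executes the def (and the call) on every loop.
plain = "def func(a, b):\n    return a + b\nfunc(6, 7)"
hinted = "def func(a: int, b: int) -> int:\n    return a + b\nfunc(6, 7)"

print(timeit.timeit(plain, number=1_000_000))   # total seconds for 1M loops
print(timeit.timeit(hinted, number=1_000_000))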

As you noticed yourself, "The function definition is what takes more time".

Why you saw a difference

Pythengers, disassemble!

import dis

dis.dis('''
def func(a, b):
    return a + b
''')

Output (using Python 3.10.4):

  2           0 LOAD_CONST               0 (<code object func at 0x00000243938989D0, file "<dis>", line 2>)
              2 LOAD_CONST               1 ('func')
              4 MAKE_FUNCTION            0
              6 STORE_NAME               0 (func)
              8 LOAD_CONST               2 (None)
             10 RETURN_VALUE

Disassembly of <code object func at 0x00000243938989D0, file "<dis>", line 2>:
  3           0 LOAD_FAST                0 (a)
              2 LOAD_FAST                1 (b)
              4 BINARY_ADD
              6 RETURN_VALUE

Loads the compiled code object, makes a function object from it, and assigns it to the name func.

With the annotated version, we instead get this:

  2           0 LOAD_CONST               0 ('a')
              2 LOAD_NAME                0 (int)
              4 LOAD_CONST               1 ('b')
              6 LOAD_NAME                0 (int)
              8 LOAD_CONST               2 ('return')
             10 LOAD_NAME                0 (int)
             12 BUILD_TUPLE              6
             14 LOAD_CONST               3 (<code object func at 0x00000243940F0A80, file "<dis>", line 2>)
             16 LOAD_CONST               4 ('func')
             18 MAKE_FUNCTION            4 (annotations)
             20 STORE_NAME               1 (func)
             22 LOAD_CONST               5 (None)
             24 RETURN_VALUE

Disassembly of <code object func at 0x00000243940F0A80, file "<dis>", line 2>:
  3           0 LOAD_FAST                0 (a)
              2 LOAD_FAST                1 (b)
              4 BINARY_ADD
              6 RETURN_VALUE

So when the def statement gets executed, additional bytecode runs to build and attach the annotations, which costs additional time.
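
The tuple assembled by those extra opcodes (BUILD_TUPLE 6 followed by MAKE_FUNCTION 4) is what ends up backing the function's __annotations__ attribute, which is easy to verify:

def func(a: int, b: int) -> int:
    return a + b

# The names and types gathered at definition time:
print(func.__annotations__)
# {'a': <class 'int'>, 'b': <class 'int'>, 'return': <class 'int'>}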

Why your classmate didn't see a difference

Why does it not make a difference for your classmate's way of measuring? Because they're not really measuring it. Try printing something in a.py:

print('importing a.py ...')
def func(a, b):
    return a + b
func(6, 7)

Then try it again:

> python -m timeit -n 10000000 "import a"
importing a.py ...
10000000 loops, best of 5: 370 nsec per loop
>

Ten million loops, but our message only got printed once. Because Python caches imports and doesn't re-import what's already imported. So while this does measure a single execution of the function definition, that one execution is utterly insignificant among ten million import statements. Really, your classmate didn't measure the code they intended to measure, but measured import statements (almost all of which got recognized as re-imports and then ignored).
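
To make the loop actually re-run a.py each time, the module has to be evicted from the import cache first. A minimal sketch (assuming the a.py above is importable; with the print still in it, the message now appears on every loop):

import timeit

t = timeit.timeit(
    'sys.modules.pop("a", None); import a',  # force a genuine re-import
    setup='import sys',
    number=10_000,
)
print(t)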

I think cell magics such as %%timeit cause the notebook to execute the whole cell body, including the function definition, on every loop. Running your code both on a command line and in a notebook, I get results similar to yours.

However, if I define the functions afunc and bfunc in one cell and then %%timeit each in a different cell, results are consistent with each other again.

[1]
def afunc(a: int, b: int) -> int:
    return a + b

def bfunc(a, b):
    return a + b
[2]
%%timeit
afunc(1,2)
40.9 ns ± 0.131 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
[3]
%%timeit
bfunc(1,2)
42.9 ns ± 0.0417 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

No significant difference, whereas

[1]
%%timeit
def afunc(a: int, b: int) -> int:
    return a + b
afunc(1,2)

145 ns ± 0.765 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
[2]
%%timeit
def bfunc(a, b):
    return a + b
bfunc(1,2)
71.2 ns ± 0.341 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

gives us a massive difference.

We can take it further and do this:

[1]
%%timeit
def afunc(a: int, b: int) -> int:
    return a + b
111 ns ± 0.358 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
[2]
%%timeit
def bfunc(a, b):
    return a + b
32.6 ns ± 0.255 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Here it becomes obvious that the definition itself is what is being measured: no user function call is executed at all, just the def statement. We observe the same discrepancy, minus the ~40 ns it takes to call either function.

A simpler test in a console seems to confirm that this isn't specific to JupyterLab, although the code here is pared down to a single statement so that it fits inside an exec call:

C:\Users\Ben>python -m timeit "exec('a: int = 1')"
50000 loops, best of 5: 4.91 usec per loop

C:\Users\Ben>python -m timeit "exec('a = 1')"
50000 loops, best of 5: 4.02 usec per loop
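
Note that most of those microseconds go to exec compiling the string on every call. Pre-compiling once narrows the measurement down to executing the annotation bytecode itself; a sketch (again, absolute numbers will vary):

import timeit

setup = (
    'hinted = compile("a: int = 1", "<s>", "exec"); '
    'plain = compile("a = 1", "<s>", "exec")'
)
# exec of a pre-compiled code object skips compilation, so only the
# SETUP_ANNOTATIONS / store opcodes are timed.
print(timeit.timeit('exec(hinted)', setup=setup))
print(timeit.timeit('exec(plain)', setup=setup))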
