How to find the implementation of the for control flow construct in Python

I have searched SO and browsed the CPython repository on GitHub, but to no avail. The source code that implements a control flow construct does not seem to be visible anywhere, and it is not clear to me why.

In particular, I need the source code for the for control flow construct in CPython.

Not knowing where else to look, all I could do was run the dis module's dis() on a small piece of code. That led me to the FOR_ITER opcode, which I do not understand.
Nor does this opcode help me understand how a nested for loop works, which is why I wanted to look at its implementation in the source code in the first place.

import dis

def foo():
    for i in range(3):
        for j in range(2):
            print(i, j)

dis.dis(foo)

 3           0 SETUP_LOOP              44 (to 46)
             2 LOAD_GLOBAL              0 (range)
             4 LOAD_CONST               1 (3)
             6 CALL_FUNCTION            1
             8 GET_ITER
       >>   10 FOR_ITER                32 (to 44)
            12 STORE_FAST               0 (i)

 4          14 SETUP_LOOP              26 (to 42)
            16 LOAD_GLOBAL              0 (range)
            18 LOAD_CONST               2 (2)
            20 CALL_FUNCTION            1
            22 GET_ITER
       >>   24 FOR_ITER                14 (to 40)
            26 STORE_FAST               1 (j)

 5          28 LOAD_GLOBAL              1 (print)
            30 LOAD_FAST                0 (i)
            32 LOAD_FAST                1 (j)
            34 CALL_FUNCTION            2
            36 POP_TOP
            38 JUMP_ABSOLUTE           24
       >>   40 POP_BLOCK
       >>   42 JUMP_ABSOLUTE           10
       >>   44 POP_BLOCK
       >>   46 LOAD_CONST               0 (None)
            48 RETURN_VALUE

The implementation was added in this commit; here's the part about FOR_ITER:

        case FOR_ITER:
            /* before: [iter]; after: [iter, iter()] *or* [] */
            v = TOP();
            x = PyObject_CallObject(v, NULL);
            if (x == NULL) {
                if (PyErr_ExceptionMatches(
                    PyExc_StopIteration))
                {
                    PyErr_Clear();
                    x = v = POP();
                    Py_DECREF(v);
                    JUMPBY(oparg);
                    continue;
                }
                break;
            }
            PUSH(x);
            continue;

Ignoring the refcounting, a for x in y: loop is equivalent to the following Python code:

# GET_ITER
y_iter = iter(y)

# FOR_ITER
while True:
    try:
        x = next(y_iter)
    except StopIteration:
        break

    # body of for loop
    pass
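
Since the question is about nested loops, the same rewriting can be applied twice. The following is only a sketch of the loop from the question spelled out with iter() and next(); the names outer_iter, inner_iter and pairs are made up for illustration, and the pairs are collected in a list instead of being printed. Note that the inner GET_ITER step runs again on every pass of the outer loop, so the inner loop gets a fresh iterator each time.

```python
pairs = []

outer_iter = iter(range(3))           # GET_ITER for the outer loop
while True:
    try:
        i = next(outer_iter)          # FOR_ITER for the outer loop
    except StopIteration:
        break                         # outer iterator exhausted: leave the loop

    inner_iter = iter(range(2))       # GET_ITER again, once per outer pass
    while True:
        try:
            j = next(inner_iter)      # FOR_ITER for the inner loop
        except StopIteration:
            break                     # only the inner loop ends here

        pairs.append((i, j))          # body of the inner loop

# Same result as the plain nested for loops.
assert pairs == [(i, j) for i in range(3) for j in range(2)]
```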

Looking at this in the current CPython code base (3.8.5):

You can see in your disassembly that every FOR_ITER is preceded by a GET_ITER.

GET_ITER source code (check the numbered comments):

    case TARGET(GET_ITER): {
        /* before: [obj]; after [getiter(obj)] */
        PyObject *iterable = TOP(); // 1.
        PyObject *iter = PyObject_GetIter(iterable); // 2.
        Py_DECREF(iterable); // 3.
        SET_TOP(iter); // 4.
        if (iter == NULL)
            goto error;
        PREDICT(FOR_ITER);
        PREDICT(CALL_FUNCTION);
        DISPATCH();
    }

GET_ITER passes the iterable that the for loop traverses to PyObject_GetIter.

The code:

  1. Makes iterable point to the object on top of the stack of Python objects;
  2. Makes iter point to the iterator returned by the PyObject_GetIter call;
  3. Decreases the reference count of the iterable;
  4. Replaces the top of the stack with iter, so the iterator is now on top of the stack.

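PyObject_GetIter first checks whether the object's type provides the tp_iter slot (that is, __iter__). If it does, the slot is called and the result is returned, provided the result really is an iterator (an object that has a next method, tp_iternext). If the slot is missing, it checks whether the object is a sequence; if so, the sequence is wrapped in a sequence iterator, and that iterator is the returned value. Otherwise a TypeError is raised.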
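
The lookup order can be approximated in pure Python. This is a rough sketch, not the real C code: get_iter, sequence_iterator and Squares are hypothetical names, and the sequence test here is simply "the type defines __getitem__", which is looser than the check CPython performs.

```python
def sequence_iterator(seq):
    """Stand-in for CPython's sequence iterator: index from 0 until IndexError."""
    index = 0
    while True:
        try:
            item = seq[index]
        except IndexError:
            return
        yield item
        index += 1


def get_iter(obj):
    """Rough Python-level approximation of PyObject_GetIter (hypothetical helper)."""
    iter_method = getattr(type(obj), "__iter__", None)
    if iter_method is not None:
        result = iter_method(obj)
        # The result must itself be an iterator, i.e. provide __next__.
        if not hasattr(type(result), "__next__"):
            raise TypeError("iter() returned a non-iterator")
        return result
    if hasattr(type(obj), "__getitem__"):
        # No __iter__, but indexable: fall back to a sequence iterator.
        return sequence_iterator(obj)
    raise TypeError(f"{type(obj).__name__!r} object is not iterable")


class Squares:
    """Defines only __getitem__, so it is iterable via the sequence fallback."""
    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * index


from_sketch = list(get_iter(Squares()))
from_builtin = list(iter(Squares()))
```

Both from_sketch and from_builtin should contain the same items, which shows that the built-in iter() also accepts an object that has __getitem__ but no __iter__.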


FOR_ITER code:

    case TARGET(FOR_ITER): {
        PREDICTED(FOR_ITER);
        /* before: [iter]; after: [iter, iter()] *or* [] */
        PyObject *iter = TOP(); // 1.
        PyObject *next = (*iter->ob_type->tp_iternext)(iter); // 2.
        if (next != NULL) {
            PUSH(next); // 3.
            PREDICT(STORE_FAST);
            PREDICT(UNPACK_SEQUENCE);
            DISPATCH();
        }
        if (_PyErr_Occurred(tstate)) {
            if (!_PyErr_ExceptionMatches(tstate, PyExc_StopIteration)) {
                goto error;
            }
            else if (tstate->c_tracefunc != NULL) {
                call_exc_trace(tstate->c_tracefunc, tstate->c_traceobj, tstate, f);
            }
            _PyErr_Clear(tstate);
        }
        /* iterator ended normally */
        STACK_SHRINK(1);
        Py_DECREF(iter);
        JUMPBY(oparg);
        PREDICT(POP_BLOCK);
        DISPATCH();
    }

Parts of interest:

  1. Gets the iterator from the top of the stack;
  2. Gets the result of calling the iterator's next method (i.e. tp_iternext);
  3. Pushes the result onto the stack if it's not NULL.
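
The stack effect described in these steps can be mimicked with a Python list. This is a minimal sketch under simplifying assumptions: for_iter is a hypothetical helper, the list stack stands in for the interpreter's value stack, and the boolean return value stands in for the choice between falling through to the loop body and taking the JUMPBY(oparg) branch.

```python
def for_iter(stack):
    """Mimic FOR_ITER: return True if a value was pushed, False if the loop ended."""
    iterator = stack[-1]              # 1. TOP(): peek, the iterator stays on the stack
    try:
        value = next(iterator)        # 2. call the iterator's next method
    except StopIteration:
        stack.pop()                   # STACK_SHRINK(1): drop the exhausted iterator
        return False                  # the interpreter would now JUMPBY(oparg)
    stack.append(value)               # 3. PUSH(next)
    return True


stack = [iter("ab")]                  # state right after GET_ITER
results = []
while for_iter(stack):
    results.append(stack.pop())       # like STORE_FAST: pop the value into a variable
```

After the loop, results holds every item produced by the iterator and stack is empty again, matching the "[iter, iter()] or []" comment in the C code.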

One thing you should be asking: this only covers a single iteration of the loop. Where is the code that makes the iterator traverse all the items?

It's the JUMP_ABSOLUTE opcode that sends execution back to FOR_ITER, which then fetches the next element. You can see in your original listing that the argument of each JUMP_ABSOLUTE is the bytecode offset (not the line number) of the corresponding FOR_ITER opcode: 24 for the inner loop and 10 for the outer one. That jump back is what makes the iteration possible.
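
You can check the jump targets on your own interpreter with dis.get_instructions. The sketch below is written under the assumption that opcode names differ between CPython versions (recent releases emit JUMP_BACKWARD where older ones emit JUMP_ABSOLUTE), so it looks for any backward jump instead of a specific opcode name. The variable names are made up for illustration.

```python
import dis


def foo():
    for i in range(3):
        for j in range(2):
            print(i, j)


instructions = list(dis.get_instructions(foo))

# Bytecode offsets of the two FOR_ITER instructions (outer and inner loop).
for_iter_offsets = {ins.offset for ins in instructions if ins.opname == "FOR_ITER"}

# Every opcode that takes a jump target, absolute or relative.
jump_opcodes = set(dis.hasjabs) | set(dis.hasjrel)

# Targets of jumps that go backwards; for jumps, argval is the resolved offset.
back_jump_targets = {
    ins.argval
    for ins in instructions
    if ins.opcode in jump_opcodes and ins.argval <= ins.offset
}

print("FOR_ITER offsets:     ", sorted(for_iter_offsets))
print("backward jump targets:", sorted(back_jump_targets))
```

Each FOR_ITER offset should also appear among the backward jump targets, one for the inner loop and one for the outer loop.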

This answer is a good reference on this subject too.
