
Appending to tuple during for in loop

I need to modify a tuple during a for in loop, such that the loop's iterator also iterates over the items added to the tuple.

From my understanding, tuples are immutable, so tup = tup + (to_add,) is just reassigning tup, not changing the original tuple. So this is tricky.

Here is a test script:

tup = ({'abc': 'a'}, {'2': '2'})
blah = True
to_add = {'goof': 'abcde'}
for i in tup:
    if blah:
        tup = tup + (to_add,)
        blah = False
    print(i)

Which prints:

{'abc': 'a'}
{'2': '2'}

What I would like is for it to print:

{'abc': 'a'}
{'2': '2'}
{'goof': 'abcde'}

From what I understand, I need to "repoint" the implicit tuple iterator mid-script so that it is pointing at the new tuple instead. (I know this is a seriously hacky thing to be doing).
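To make the problem concrete, here is a minimal sketch of what the for statement does behind the scenes: it calls iter() once, and the resulting iterator keeps its own reference to the original tuple. The names original, it and seen are only for illustration.

```python
tup = ('a', 'b')
original = tup

# The for statement effectively does this once, before the first iteration.
it = iter(tup)

# Rebinding the name creates a new tuple; the iterator still holds the old one.
tup = tup + ('c',)

seen = list(it)
print(tup is original)  # the name now refers to a different object
print(seen)             # only the items of the original tuple
```

So the iterator never sees the third item, because nothing connects it to the name tup after iter() has returned.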

This script accesses the tuple_iterator in question:

import gc

tup = ({'abc': 'a'}, {'2': '2'})
blah = True
to_add = {'goof': 'abcde'}
for i in tup:
    if blah:
        tup = tup + (to_add,)
        blah = False
        refs = gc.get_referrers(i)
        for ref in refs:
            # Compare by identity: we want the old tuple object, not an unequal one
            if type(ref) is tuple and ref is not tup:
                refs_to_tup = gc.get_referrers(ref)
                for j in refs_to_tup:
                    if type(j).__name__ == 'tuple_iterator':
                        tuple_iterator = j

    print(i)

How can I modify this tuple_iterator so that it points at the new tup, and not the old? Is this even possible?

I am aware that this is a really strange situation. I cannot change the fact that tup is a tuple or that I need to use an implicit for in, as I am trying to plug into code that I cannot change.

There is no way, either portably or specifically within CPython, to do what you're trying to do from within Python, even via undocumented internals of the tuple_iterator object. The tuple reference is stored in a C-level field that isn't exposed to Python and, unlike the stored index, isn't modified by __setstate__ or any other method.
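As a small illustration of the difference between the stored index and the stored tuple, the sketch below rewinds a tuple iterator with __setstate__. This assumes CPython, where tuple_iterator happens to provide __setstate__ as part of its pickling support. It is an implementation detail, not a documented API.

```python
it = iter(('a', 'b', 'c'))
first = next(it)
second = next(it)

# CPython-specific: __setstate__ overwrites the iterator's stored index.
it.__setstate__(0)
again = next(it)
print(first, second, again)

# There is no comparable hook that swaps the tuple being iterated over.
```

The index can be moved around this way, but the iterator keeps walking the same tuple it was created with.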

However, if you're willing to start monkeying with C pointers behind CPython's back, and you know how to debug the inevitable segfaults…

Under the covers, there's a C struct representing tuple_iterator. I think it's either seqiterobject, or a struct with the exact same shape, but you should read through the tupleobject.c source code to make sure.

Here's what that type looks like in C:

typedef struct {
    PyObject_HEAD
    Py_ssize_t it_index;
    PyObject *it_seq; /* Set to NULL when iterator is exhausted */
} seqiterobject;

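So, what happens if you create a ctypes.Structure subclass with the same layout as this, something like this:

import ctypes

class seqiterobject(ctypes.Structure):
    # Mirrors the C struct above; PyObject_HEAD expands to ob_refcnt and
    # ob_type on a standard (non-debug) CPython build.
    _fields_ = (
        ('ob_refcnt', ctypes.c_ssize_t),
        ('ob_type', ctypes.c_void_p),
        ('it_index', ctypes.c_ssize_t),
        # Raw address, so that an id() value can be assigned to it directly
        ('it_seq', ctypes.c_void_p))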

… and then do this:

seqiter = seqiterobject.from_address(id(j))

… and then do this:

seqiter.it_seq = id(other_tuple)

…? Well, you probably corrupt the heap by underreferencing the new value (and also leak the old one), so you'll need to incref the new value and decref the old value first.

But, if you do that… most likely, either it'll segfault the next time you call __next__, or it'll work.

If you want more example code that does similar things, see superhackyinternals. Other than the fact that seqiterobject is not even a public type, so this is even more hacky, everything else is basically the same.
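Putting the steps above together, a guarded sketch might look like the following. Everything here is an assumption rather than a supported technique: the struct layout is assumed to match a standard 64-bit CPython build (not a debug or free-threaded one), TupleIterStruct and repoint are hypothetical names, and the function refuses to write anything unless the memory looks the way it expects. It uses an explicit iter() and next() rather than a for statement.

```python
import ctypes

class TupleIterStruct(ctypes.Structure):
    # Assumed layout: PyObject_HEAD (ob_refcnt, ob_type), it_index, it_seq.
    _fields_ = [
        ('ob_refcnt', ctypes.c_ssize_t),
        ('ob_type', ctypes.c_void_p),
        ('it_index', ctypes.c_ssize_t),
        ('it_seq', ctypes.c_void_p),
    ]

# Py_IncRef / Py_DecRef are exported by the CPython C API.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None
_decref = ctypes.pythonapi.Py_DecRef
_decref.argtypes = [ctypes.py_object]
_decref.restype = None

def repoint(tuple_iter, old_tuple, new_tuple):
    """Try to make tuple_iter walk new_tuple; return True if it was swapped."""
    raw = TupleIterStruct.from_address(id(tuple_iter))
    # Sanity checks: bail out if the assumed layout does not match.
    if raw.ob_type != id(type(tuple_iter)) or raw.it_seq != id(old_tuple):
        return False
    _incref(new_tuple)           # the iterator now owns a reference to new_tuple
    raw.it_seq = id(new_tuple)   # swap the pointer
    _decref(old_tuple)           # drop the iterator's reference to old_tuple
    return True

old = ('a', 'b')
new = old + ('c',)
it = iter(old)
first = next(it)
swapped = repoint(it, old, new)
rest = list(it)
print(swapped, first, rest)
```

If the layout check fails, repoint returns False and the iterator simply finishes the original tuple. Whether a for statement in a given CPython version picks up the swapped pointer is a separate question that this sketch does not answer.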

You could write your own coroutine and send the new tup to it.

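import itertools as it

def coro(iterable):
    iterable = iter(iterable)
    while True:
        try:
            v = next(iterable)
            i = yield v  # i is None unless the caller used send()
        except StopIteration:
            break
        if i:
            yield v  # this is the value that send() returns to the caller
            iterable = it.chain(iterable, i)  # queue the sent items at the end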

Then this works as you describe:

In []:   
blah = True
tup = ({'abc': 'a'}, {'2': '2'})
to_add = {'goof': 'abcde'}

c = coro(tup)
for i in c:
    if blah:
        i = c.send((to_add,))
        blah = False
    print(i)

Out[]:
{'abc': 'a'}
{'2': '2'}
{'goof': 'abcde'}

I'm sure there are lots of edge cases I'm missing in the above but it should give you an idea of how it can be done.

Since you plan on modifying the tuple inside the loop, you are probably better off using a while loop that keeps track of the current index rather than relying on an iterator. Iterators are best suited to looping over collections that are not added to or removed from during the loop.

If you run the example below, the loop runs 3 times and the resulting tup contains the added item.

tup = ({'abc': 'a'}, {'2': '2'})
blah = True
to_add = {'goof': 'abcde'}

i = 0
while i < len(tup):
    cur = tup[i]
    if blah:
        tup = tup + (to_add,)
        blah = False
    i += 1

print(tup)
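If a for loop is still wanted at the call site, the same index-tracking idea could be wrapped in a generator. This is only a sketch: iter_by_index and the state dictionary are hypothetical names, and it assumes the loop can be pointed at a callable that returns the current tuple.

```python
def iter_by_index(get_tuple):
    """Yield items by index, re-reading the current tuple on every step."""
    index = 0
    while index < len(get_tuple()):
        yield get_tuple()[index]
        index += 1

state = {'tup': ({'abc': 'a'}, {'2': '2'})}
to_add = {'goof': 'abcde'}
seen = []

for item in iter_by_index(lambda: state['tup']):
    if len(state['tup']) == 2:
        # Rebinding is visible to the generator because it re-reads state['tup'].
        state['tup'] = state['tup'] + (to_add,)
    seen.append(item)
    print(item)
```

Because the length and the item are looked up again on each step, the item appended during the first pass is reached on the third.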
