Extract a list from itertools.cycle

I have a class that contains an itertools.cycle instance, and I would like to be able to copy it. One approach (the only one I can come up with) is to extract the initial iterable (which was a list) and store the position that the cycle is at.

Unfortunately, I am unable to get hold of the list which I used to create the cycle instance, nor does there seem to be an obvious way to do it:

import itertools
c = itertools.cycle([1, 2, 3])
print dir(c)
['__class__', '__delattr__', '__doc__', '__format__', '__getattribute__', 
 '__hash__', '__init__', '__iter__', '__new__', '__reduce__', 
 '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', 
 '__subclasshook__', 'next']

I can come up with some half-reasonable reasons why this would be disallowed for some types of input iterables, but for a tuple or perhaps even a list (mutability might be a problem there), I can't see why it wouldn't be possible.

Does anyone know if it's possible to extract the non-infinite iterable out of an itertools.cycle instance? If not, does anybody know why this idea is a bad one?

It's impossible. If you look at the itertools.cycle code you'll see that it does not store the sequence itself. It only creates an iterator over the iterable and stores the values that iterator yields in a newly created list:

static PyObject *
cycle_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    PyObject *it;
    PyObject *iterable;
    PyObject *saved;
    cycleobject *lz;

    if (type == &cycle_type && !_PyArg_NoKeywords("cycle()", kwds))
        return NULL;

    if (!PyArg_UnpackTuple(args, "cycle", 1, 1, &iterable))
        return NULL;
    /* NOTE: they do not store the *sequence*, only the iterator */
    /* Get iterator. */
    it = PyObject_GetIter(iterable);
    if (it == NULL)
        return NULL;

    saved = PyList_New(0);
    if (saved == NULL) {
        Py_DECREF(it);
        return NULL;
    }

    /* create cycleobject structure */
    lz = (cycleobject *)type->tp_alloc(type, 0);
    if (lz == NULL) {
        Py_DECREF(it);
        Py_DECREF(saved);
        return NULL;
    }
    lz->it = it;
    lz->saved = saved;
    lz->firstpass = 0;

    return (PyObject *)lz;
}

This means that when doing:

itertools.cycle([1,2,3])

The list you create has only one reference, which is held by the iterator that cycle uses. When that iterator is exhausted, it is deleted and a new iterator is created over the saved values:

    /* taken from the "cycle.next" implementation */
    it = PyObject_GetIter(lz->saved);
    if (it == NULL)
        return NULL;
    tmp = lz->it;
    lz->it = it;
    lz->firstpass = 1;
    Py_DECREF(tmp);   /* destroys the old iterator */

This means that after one full pass the original list is destroyed (assuming nothing else references it), and the cycle carries on from its own saved list of values.
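
The behaviour described above roughly matches the pure-Python equivalent given in the itertools documentation. The sketch below is written for Python 3; the name cycle_sketch and the variables around it are made up for illustration. It shows that the values come from an internal saved list after the first pass, so the source list is no longer needed at that point.

```python
def cycle_sketch(iterable):
    """Rough pure-Python stand-in for itertools.cycle (illustrative only)."""
    saved = []
    # First pass: hand out values from the iterator and remember each one.
    for element in iterable:
        yield element
        saved.append(element)
    # Later passes: only the saved values are used, never the original object.
    while saved:
        for element in saved:
            yield element


source = [1, 2, 3]
c = cycle_sketch(source)
first_pass = [next(c) for _ in range(3)]

# Emptying the source no longer matters once the first pass has been read.
source.clear()
second_pass = [next(c) for _ in range(3)]
```

Nothing in the generator exposes saved to the caller, which is the same reason the real cycle object offers no attribute to read the list back.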

Anyway, if you need access to this list, just keep a reference to it somewhere before calling itertools.cycle.
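
As a minimal sketch of that advice (Python 3), the snippet below keeps the items and a running count next to the cycle, then builds a second cycle at the same position. The helper cycle_at and the consumed counter are hypothetical names, and the items are assumed to be a non-empty sequence.

```python
import itertools


def cycle_at(items, position):
    """Return a fresh itertools.cycle over items, advanced by position steps."""
    fresh = itertools.cycle(items)
    # Only the offset within one pass matters, so reduce modulo the length.
    for _ in range(position % len(items)):
        next(fresh)
    return fresh


items = (1, 2, 3)                  # reference kept before building the cycle
original = itertools.cycle(items)
consumed = 0                       # how many values have been read so far

for _ in range(4):                 # reads 1, 2, 3, 1
    next(original)
    consumed += 1

# An independent cycle positioned where the original currently is.
duplicate = cycle_at(items, consumed)
```

The caller has to do the counting, since the cycle object itself does not report its position.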

If you know certain properties of the objects being yielded by cycle, then you can deduce the inner list. For example, if you know that all the objects in the cycle are distinct AND that nothing else is reading from the cycle iterator besides you, then you can simply wait for the first object you see to appear again (testing with is, not ==) to find the end of the inner list.

But without such knowledge there are no guarantees, and any method you choose for guessing what the cycle is will fail in certain cases.
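
A sketch of the identity-based approach in Python 3, under exactly those assumptions (all objects distinct, a single reader, a non-empty cycle). The function name recover_period is hypothetical.

```python
import itertools


def recover_period(cyc):
    """Consume one full period of cyc and return the objects seen, in order.

    Assumes every object in the cycle is distinct and that nothing else is
    reading from cyc; otherwise the result is wrong or the loop never ends.
    """
    first = next(cyc)
    seen = [first]
    for obj in cyc:
        if obj is first:          # identity test, not equality
            break
        seen.append(obj)
    return seen


a, b, c = object(), object(), object()
cyc = itertools.cycle([a, b, c])
next(cyc)                          # one value has already been consumed
period = recover_period(cyc)       # starts from the current position
```

The result is a rotation of the inner list starting at wherever the cycle happened to be, and the cycle ends up one step past the start of that rotation because the repeated object had to be read to detect the wrap-around.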

OK, so I have accepted @Bakuriu's answer, as it is technically correct. It is not possible to copy/pickle an itertools.cycle object.

I have implemented a subclass of itertools.cycle which is picklable (with a couple of extra bells and whistles to boot).

import itertools


class FiniteCycle(itertools.cycle):
    """
    Cycles the given finite iterable indefinitely. 
    Subclasses ``itertools.cycle`` and adds pickle support.
    """
    def __init__(self, finite_iterable):
        self._index = 0
        self._iterable = tuple(finite_iterable)
        self._iterable_len = len(self._iterable)
        itertools.cycle.__init__(self, self._iterable)

    @property
    def index(self):
        return self._index

    @index.setter
    def index(self, index):
        """
        Sets the current index into the iterable. 
        Keeps the underlying cycle in sync.

        Negative indexing supported (will be converted to a positive index).
        """
        index = int(index)
        if index < 0:
            index = self._iterable_len + index
            if index < 0:
                raise ValueError('Negative index is larger than the iterable length.')

        if index > self._iterable_len - 1:
            raise IndexError('Index is too high for the iterable. Tried %s, iterable '
                             'length %s.' % (index, self._iterable_len))

        # calculate the positive number of times the iterable will need to be moved
        # forward to get to the desired index
        delta = (index + self._iterable_len - self.index) % (self._iterable_len)

        # move the finite cycle on ``delta`` times.
        for _ in xrange(delta):
            self.next()

    def next(self):
        self._index += 1
        if self._index >= self._iterable_len:
            self._index = 0
        return itertools.cycle.next(self)

    def peek(self):
        """
        Return the next value in the cycle without moving the iterable forward.
        """
        return self._iterable[self.index]

    def __reduce__(self):
        return (FiniteCycle, (self._iterable, ), {'index': self.index})

    def __setstate__(self, state):
        self.index = state.pop('index')

Some example usage:

c = FiniteCycle([1, 2, 3])

c.index = -1
print c.next() # prints 3

print [c.next() for _ in xrange(4)] # prints [1, 2, 3, 1]

print c.peek() # prints 2
print c.next() # prints 2

import pickle
serialised_cycle = pickle.dumps(c)

del c

c = pickle.loads(serialised_cycle)

print c.next() # prints 3
print c.next() # prints 1
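
The class above targets Python 2 (next, xrange, print statements). For Python 3, a rough alternative is sketched below. It is not a port of FiniteCycle: it drops the itertools.cycle base class and simply indexes into a stored tuple, and the name IndexedCycle is made up here. Because the state consists only of a tuple and an integer, the default copy machinery can handle it.

```python
import copy


class IndexedCycle:
    """Cycle over a finite iterable, remembering the items and the position."""

    def __init__(self, finite_iterable, index=0):
        self._items = tuple(finite_iterable)
        if not self._items:
            raise ValueError("IndexedCycle needs at least one item")
        self._index = index % len(self._items)

    def __iter__(self):
        return self

    def __next__(self):
        value = self._items[self._index]
        # Wrap around to the start after the last item.
        self._index = (self._index + 1) % len(self._items)
        return value

    def peek(self):
        """Return the next value without advancing."""
        return self._items[self._index]


c = IndexedCycle([1, 2, 3])
first_four = [next(c) for _ in range(4)]   # reads 1, 2, 3, 1
clone = copy.copy(c)                       # independent copy, same position
```

When the class is defined at module level, pickle.dumps and pickle.loads should round-trip an instance in the same way, since pickling by default relies on the same instance state.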

Feedback welcome.

Thanks,

Depending on how you're using cycle, you could even get away with a custom class wrapper as simple as this:

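from itertools import cycle


class SmartCycle:
    def __init__(self, x):
        self.cycle = cycle(x)
        # keep a reference to the original iterable so it can be read back later
        self.to_list = x

    def __iter__(self):
        return self

    def __next__(self):
        return next(self.cycle)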

e.g.

> a = SmartCycle([1, 2, 3])
> for _ in range(4):
>     print(next(a))
1
2
3
1

> a.to_list
[1, 2, 3]
