

A way in Python to agnostically append() / add() to a collection (or other receiver)?

Is there a way in Python to add agnostically to a collection?

Given the prevalence of duck typing, I was surprised that the method to add to a list is append(x) but the method to add to a set is add(x).

I'm writing a family of utility functions that need to build up collections and would ideally like them not to care what type is accumulating the result. It should at least work for list and set, and ideally for other targets, as long as they know what method to implement. Essentially, the duck type here is 'thing to which items can be added'.

In practice, these utility functions will either be passed the target object to add the results to, or, more commonly, a function that generates new instances of the target type when needed.

For example:

def collate(xs, n, f_make=lambda: list()):
    if n < 1:
        raise ValueError('n < 1')
    col = f_make()
    for x in xs:
        if len(col) == n:
            yield col
            col = f_make()
        col.append(x)  # append() okay for list but not for set
    yield col

>>> list(collate(range(6), 3))
[[0, 1, 2], [3, 4, 5]]

>>> list(collate(range(6), 4))
[[0, 1, 2, 3], [4, 5]]

>>> # desired result here: [{0, 1, 2, 3}, {4, 5}]
>>> list(collate(range(6), 4, f_make=lambda: set()))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/paul/proj/mbrain/src/fossil/fn.py", line 42, in collate
    col.append(x)
AttributeError: 'set' object has no attribute 'append'

Here collate() is just a simple example. I expect there's already a way to achieve this 'collation' in Python. That's not the real question here.

I'm currently using Python 3.8.5.

(This is an edited answer. Feel free to look at the history for my old answer, but it was not relevant to the question.)

The Pythonic way is to use the standard library here. Rather than manipulating lists, you can use some of the built-in functions that work on iterables more generally, such as those in itertools.

This function is a rough guideline for using itertools. It doesn't handle the case where f_make() isn't empty. It's also a bit dense, and if you're not used to Python you probably won't find it easy to read, but it's probably one of the more Pythonic ways to do this. I'm not certain I'd recommend using it, but it does a lot in a fairly small number of lines, which is sort of the point. I'm sure someone else could find a "more Pythonic" approach though.

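import itertools

def collate(xs, n, f_make=list):
    result = f_make()
    # Group items by chunk index (position // n); enumerate() keeps keys consecutive.
    for _, val in itertools.groupby(
        enumerate(xs),
        lambda i: (len(result) + i[0]) // n
    ):
        # Note: each chunk is always yielded as a list, whatever f_make returns.
        yield list(itertools.chain(result, (v[1] for v in val)))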
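
One way to sidestep the append()/add() difference entirely is to build each chunk from an iterable rather than adding items one at a time. The sketch below is not the function above; it assumes that f_make accepts an iterable (as list, set and tuple do), which differs from the zero-argument factory in the question. The name collate_chunks is made up for illustration.

```python
import itertools


def collate_chunks(xs, n, f_make=list):
    """Yield chunks of up to n items, each built as f_make(iterable)."""
    if n < 1:
        raise ValueError('n < 1')
    it = iter(xs)
    while True:
        # islice() pulls at most n items from the shared iterator.
        chunk = f_make(itertools.islice(it, n))
        if not chunk:
            # An empty chunk means the input is exhausted.
            return
        yield chunk


print(list(collate_chunks(range(6), 4)))
print(list(collate_chunks(range(6), 4, set)))
```

With a set, duplicate inputs collapse, so a chunk can end up holding fewer than n items even though n items were consumed for it.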

Edit 2

Your question has been edited so I'll address the clear points in there now:

If you want a duck-typed way to add to an iterable, you should probably create your own data structure. That way, you can handle all iterables, and not just sets and lists. If you pass a generator in, or something that's been sorted, or a map result, you probably want to be able to handle those too, right?

Here's an example of a wrapper for that kind of thing:

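import itertools


class Appender:
  def __init__(self, iterable):
    # Materialize once so that counting does not exhaust a generator.
    items = list(iterable)
    self.length = len(items)
    self.iterable = iter(items)

  def append(self, new_item):
    self.length += 1
    # chain() expects iterables, so wrap the single item in a tuple.
    self.iterable = itertools.chain(self.iterable, (new_item,))

  def __iter__(self):
    # The underlying iterator is single-use: it can be consumed only once.
    return self.iterable

  def __len__(self):
    return self.length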

Note that you can further modify this to be a MutableSequence, but I don't think that's strictly necessary for your use case, where you just need length. If you don't care about iterables, then I'd advise you change your question title to remove "or other receivers".

Also note that this doesn't handle sets like sets (obviously). I'm of the belief that it should be up to the caller to manage the output of a function. I personally feel that it's perfectly acceptable to require that a function caller only pass in a MutableSequence, and the responsibility of casting it to a set should be separate. This leads to clearer and more concise functions that require less logic. If you expect that a set and/or dict is going to be a common target, it's likely worth handling that separately. As was mentioned in comments to your question, these are fundamentally different data types (particularly sets, which are not ordered and thus can't really be collated without first being sorted into a non-set anyway).
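
To illustrate that separation of responsibilities, here is a minimal sketch in which the function only ever builds lists and the caller converts each chunk afterwards. The name collate_lists is hypothetical; the body is the question's loop with the factory removed.

```python
def collate_lists(xs, n):
    """Group xs into lists of at most n items, preserving order."""
    if n < 1:
        raise ValueError('n < 1')
    col = []
    for x in xs:
        if len(col) == n:
            yield col
            col = []
        col.append(x)
    if col:
        yield col


# The caller, not the function, decides the final container type.
as_sets = [set(chunk) for chunk in collate_lists(range(6), 4)]
print(as_sets)
```

The cost is one extra copy per chunk, which may or may not matter for your data sizes.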

Returning to this later, I found a better solution using @functools.singledispatch, which is also user-extensible to additional types.

import functools
from collections.abc import MutableSequence, MutableSet

@functools.singledispatch
def append(xs, v):
    raise ValueError('append() not supported for ' + str(type(xs)))


@append.register
def _(xs: MutableSequence, v):
    xs.append(v)


@append.register
def _(xs: MutableSet, v):
    xs.add(v)
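
As a rough illustration of the "user-extensible" part, the sketch below repeats the dispatcher so that it runs on its own and then registers one more receiver. The Tally class is invented purely for this example; any type with its own notion of "adding" could be registered the same way.

```python
import functools
from collections.abc import MutableSequence, MutableSet


@functools.singledispatch
def append(xs, v):
    raise ValueError('append() not supported for ' + str(type(xs)))


@append.register
def _(xs: MutableSequence, v):
    xs.append(v)


@append.register
def _(xs: MutableSet, v):
    xs.add(v)


class Tally:
    """Hypothetical receiver that only counts the items it is given."""

    def __init__(self):
        self.count = 0


# A user can register a new receiver without touching the code above.
@append.register
def _(xs: Tally, v):
    xs.count += 1


items, seen, tally = [], set(), Tally()
for target in (items, seen, tally):
    append(target, 'a')
    append(target, 'a')
```

After this loop the list holds both items, the set holds one, and the tally has counted two, all through the same append() call. Types that are not registered, such as frozenset, fall through to the default implementation and raise ValueError.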

Here's the solution I ended up with...

from collections.abc import MutableSequence, MutableSet


def appender(xs):
    if isinstance(xs, MutableSequence):
        f = xs.append
    elif isinstance(xs, MutableSet):
        f = xs.add
    # Could probably do better validation here...
    elif hasattr(xs, 'append'):
        f = xs.append
    else:
        raise ValueError('Don\'t know how to append to ' + str(type(xs)))
    return f


def collate(xs, n, f_make=lambda: list()):
    if n < 1:
        raise ValueError('n < 1')
    col = f_make()
    app = appender(col)
    for x in xs:
        if len(col) == n:
            yield col
            col = f_make()
            app = appender(col)
        app(x)
    if col:
        yield col

>>> list(collate(range(6), 4, set))
[{0, 1, 2, 3}, {4, 5}]

>>> list(collate(range(6), 4, list))
[[0, 1, 2, 3], [4, 5]]

(I previously added this to the question, and it was removed. So I'm now adding it as an answer.)

Additionally, just to clarify the intended behaviour:

>>> list(collate(range(6), 2, list))
[[0, 1], [2, 3], [4, 5]]

>>> list(collate(range(6), 1, set))
[{0}, {1}, {2}, {3}, {4}, {5}]


 