
How to deduce or subtype a named tuple from another named tuple?

Preface

I was wondering how to conceptualize data classes in a Pythonic way. Specifically, I'm talking about DTOs (Data Transfer Objects).

I found a good answer in @jeff-oneill's question “Using Python class as a data container”, where @joe-kington made a good point about using the built-in namedtuple.

Question

Section 8.3.4 of the Python 2.7 documentation has a good example of how to combine several named tuples. My question is: how do I achieve the reverse?

Example

Consider the example from the documentation:

>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> p = Point(11, y=22)
>>> p._fields            # view the field names
('x', 'y')

>>> Color = namedtuple('Color', 'red green blue')
>>> Pixel = namedtuple('Pixel', Point._fields + Color._fields)
>>> Pixel(11, 22, 128, 255, 0)
Pixel(x=11, y=22, red=128, green=255, blue=0)

How can I deduce a “Color” or a “Point” instance from a “Pixel” instance?

Preferably in a Pythonic way.

Here it is. By the way, if you need this operation often, you can write a function that creates color_ins from pixel_ins, or even one that works for any sub-namedtuple!

from collections import namedtuple

Point = namedtuple('Point', 'x y')
Color = namedtuple('Color', 'red green blue')
Pixel = namedtuple('Pixel', Point._fields + Color._fields)

pixel_ins = Pixel(x=11, y=22, red=128, green=255, blue=0)
color_ins = Color._make(getattr(pixel_ins, field) for field in Color._fields)

print(color_ins)

Output: Color(red=128, green=255, blue=0)

Function for extracting an arbitrary sub-namedtuple (without error handling):

def extract_sub_namedtuple(parent_ins, child_cls):
    return child_cls._make(getattr(parent_ins, field) for field in child_cls._fields)

color_ins = extract_sub_namedtuple(pixel_ins, Color)
point_ins = extract_sub_namedtuple(pixel_ins, Point)
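
The helper above skips error handling. A minimal sketch of a checked variant could look like the following; the name extract_sub_namedtuple_checked and the choice of TypeError are illustrative assumptions, not part of the original answer.

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
Color = namedtuple('Color', 'red green blue')
Pixel = namedtuple('Pixel', Point._fields + Color._fields)


# Hypothetical helper: same idea as extract_sub_namedtuple, plus a field check.
def extract_sub_namedtuple_checked(parent_ins, child_cls):
    """Build child_cls from the same-named fields of parent_ins."""
    # Collect the child fields that the parent does not provide.
    missing = [f for f in child_cls._fields if f not in parent_ins._fields]
    if missing:
        raise TypeError('%s lacks fields %s required by %s' % (
            type(parent_ins).__name__, missing, child_cls.__name__))
    return child_cls._make(getattr(parent_ins, f) for f in child_cls._fields)


pixel_ins = Pixel(x=11, y=22, red=128, green=255, blue=0)
color_ins = extract_sub_namedtuple_checked(pixel_ins, Color)
print(color_ins)
```

Passing a parent that lacks some of the child's fields (for example, a Point when a Color is requested) raises TypeError up front instead of failing with an AttributeError halfway through.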

Point._fields + Color._fields is simply a tuple. So given this:

from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
Color = namedtuple('Color', 'red green blue')
Pixel = namedtuple('Pixel', Point._fields + Color._fields)

f = Point._fields + Color._fields

type(f) is just tuple. Therefore, there is no way to know where it came from.

I recommend that you look into attrs for easily defining attribute-based classes. It lets you use proper inheritance and avoids the overhead of writing all the usual field-access methods yourself.

So you can do:

import attr

@attr.s
class Point:
    x, y = attr.ib(), attr.ib()

@attr.s
class Color:
    red, green, blue = attr.ib(), attr.ib(), attr.ib()

class Pixel(Point, Color):
    pass

Now, Pixel.__bases__ will give you (__main__.Point, __main__.Color).

Here's an alternative implementation of Nikolay Prokopyev's extract_sub_namedtuple that uses a dictionary instead of getattr.

from collections import namedtuple

Point = namedtuple('Point', 'x y')
Color = namedtuple('Color', 'red green blue')
Pixel = namedtuple('Pixel', Point._fields + Color._fields)

def extract_sub_namedtuple(tup, subtype):
    d = tup._asdict()
    return subtype(**{k:d[k] for k in subtype._fields})

pix = Pixel(11, 22, 128, 255, 0)

point = extract_sub_namedtuple(pix, Point)
color = extract_sub_namedtuple(pix, Color)
print(point, color)

Output:

Point(x=11, y=22) Color(red=128, green=255, blue=0)

This could be written as a one-liner:

def extract_sub_namedtuple(tup, subtype):
    return subtype(**{k:tup._asdict()[k] for k in subtype._fields})

but it's less efficient because it has to call tup._asdict() for each field in subtype._fields.

Of course, for these specific namedtuples, you can just do:

point = Point(*pix[:2])
color = Color(*pix[2:])

but that's not very elegant because it hard-codes the parent field positions and lengths.
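
If slicing is still preferred, the positions can be derived from the field names rather than hard-coded. The sketch below is one possible way to do that, assuming the child's fields sit contiguously in the parent; slice_sub_namedtuple is a hypothetical name introduced here for illustration.

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
Color = namedtuple('Color', 'red green blue')
Pixel = namedtuple('Pixel', Point._fields + Color._fields)


# Hypothetical helper: slice the parent using offsets computed from _fields.
def slice_sub_namedtuple(tup, subtype):
    # Locate the first child field in the parent, then take a contiguous run.
    start = tup._fields.index(subtype._fields[0])
    stop = start + len(subtype._fields)
    # Guard against parents whose layout does not match the child's fields.
    if tup._fields[start:stop] != subtype._fields:
        raise ValueError('fields of %s are not contiguous in %s'
                         % (subtype.__name__, type(tup).__name__))
    return subtype._make(tup[start:stop])


pix = Pixel(11, 22, 128, 255, 0)
point = slice_sub_namedtuple(pix, Point)
color = slice_sub_namedtuple(pix, Color)
print(point, color)
```

This only works when the parent was built by concatenating the children's fields in order; the name-based lookups shown earlier do not have that restriction.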

FWIW, there's code to combine multiple namedtuples into one namedtuple, preserving field order and skipping duplicate fields, in this answer.
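
The linked code is not reproduced here. As a rough idea of what such a combination could look like, here is a minimal sketch that merges namedtuple classes, keeps the first occurrence of each field name, and preserves order; combine_namedtuples is a hypothetical name and this is not the code from that answer.

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
Color = namedtuple('Color', 'red green blue')
Hue = namedtuple('Hue', 'red green blue')


# Hypothetical helper, not taken from the linked answer.
def combine_namedtuples(typename, *classes):
    fields = []
    for cls in classes:
        for name in cls._fields:
            # Keep only the first occurrence of each field name, in order.
            if name not in fields:
                fields.append(name)
    return namedtuple(typename, fields)


Pixel = combine_namedtuples('Pixel', Point, Color, Hue)
print(Pixel._fields)
```

Because Hue repeats the field names of Color, its fields are skipped and Pixel ends up with x, y, red, green and blue.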

Another way you could do this is to make the arguments for "Pixel" align with what you actually want instead of flattening all of the arguments for its constituent parts.

Instead of combining Point._fields + Color._fields to get the fields for Pixel, I think you should just have two parameters: location and color. These two fields could be initialized with your other tuples and you wouldn't have to do any inference.

For example:

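from collections import namedtuple

Point = namedtuple('Point', 'x y')
Color = namedtuple('Color', 'red green blue')
# Pixel holds the two named tuples instead of their flattened fields
Pixel = namedtuple('Pixel', 'location color')

# Instead of Pixel(x=11, y=22, red=128, green=255, blue=0)
pixel_ins = Pixel(Point(x=11, y=22), Color(red=128, green=255, blue=0))

# Get the named tuples that the pixel is parameterized by
pixel_color = pixel_ins.color
pixel_point = pixel_ins.location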

By mashing all the parameters together (e.g. x, y, red, green, and blue all on the main object) you don't really gain anything, but you lose a lot of legibility. Flattening the parameters also raises an error if your namedtuple parameters share field names:

from collections import namedtuple 

Point = namedtuple('Point', ['x', 'y'])
Color = namedtuple('Color', 'red green blue')
Hue = namedtuple('Hue', 'red green blue')
Pixel = namedtuple('Pixel', Point._fields + Color._fields + Hue._fields)
# Results in:
#    Traceback (most recent call last):
#      File "<stdin>", line 1, in <module>
#      File "C:\Program Files\Python38\lib\collections\__init__.py", line 370, in namedtuple
#        raise ValueError(f'Encountered duplicate field name: {name!r}')
#    ValueError: Encountered duplicate field name: 'red'
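
If the flattened layout has to stay, one possible workaround is the rename=True option of namedtuple, which replaces invalid or duplicate field names with positional names instead of raising. The sketch below only illustrates that option; the variable names all_fields and duplicate_error are introduced here for the example.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
Color = namedtuple('Color', 'red green blue')
Hue = namedtuple('Hue', 'red green blue')

all_fields = Point._fields + Color._fields + Hue._fields

# Without rename, the duplicate names raise ValueError as shown above.
try:
    namedtuple('Pixel', all_fields)
    duplicate_error = None
except ValueError as exc:
    duplicate_error = exc

# With rename=True, each duplicate is replaced by an underscore
# followed by its position in the field list.
Pixel = namedtuple('Pixel', all_fields, rename=True)
print(Pixel._fields)
```

The renamed fields are no longer self-describing, so this avoids the error but does not restore the readability that the nested location/color layout provides.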

  

Background

I originally asked this question because I had to support a spaghetti codebase that used tuples heavily without explaining the values inside them. After some refactoring, I noticed that I needed to extract typed information from other tuples and was looking for a boilerplate-free and type-safe way of doing it.

Solution

You can subclass the named tuple definition and implement a custom __new__ method to support that, optionally carrying out some data formatting and validation along the way. See this reference for more details.

Example

from __future__ import annotations

from collections import namedtuple
from typing import Union, Tuple

Point = namedtuple('Point', 'x y')
Color = namedtuple('Color', 'red green blue')
Pixel = namedtuple('Pixel', Point._fields + Color._fields)

# Redeclare "Color" to provide custom creation method
# that can deduce values from various different types
class Color(Color):

    def __new__(cls, *subject: Union[Pixel, Color, Tuple[float, float, float]]) -> Color:
        # If got only one argument either of type "Pixel" or "Color"
        if len(subject) == 1 and isinstance((it := subject[0]), (Pixel, Color)):
            # Create from invalidated color properties
            return super().__new__(cls, *cls.invalidate(it.red, it.green, it.blue))
        else:  # Else treat it as raw values and by-pass them after invalidation
            return super().__new__(cls, *cls.invalidate(*subject))

    @classmethod
    def invalidate(cls, r, g, b) -> Tuple[float, float, float]:
        # Convert values to float
        r, g, b = (float(it) for it in (r, g, b))
        # Ensure that all values are in valid range
        assert all(0 <= it <= 1.0 for it in (r, g, b)), 'Some RGB values are invalid'
        return r, g, b

Now you can instantiate Color from any of the supported value types (Color, Pixel, a triplet of numbers) without boilerplate.

color = Color(0, 0.5, 1)
from_color = Color(color)
from_pixel = Color(Pixel(3.4, 5.6, 0, 0.5, 1))

And you can verify that all the values are equal:

>>> (0.0, 0.5, 1.0) == color == from_color == from_pixel
True

