Force type conversion in Python dataclass __init__ method

I have the following very simple dataclass:

import dataclasses

@dataclasses.dataclass
class Test:
    value: int

I create an instance of the class, but pass a string instead of an integer:

>>> test = Test('1')
>>> type(test.value)
<class 'str'>

What I actually want is a forced conversion to the datatype I defined in the class definition:

>>> test = Test('1')
>>> type(test.value)
<class 'int'>

Do I have to write the __init__ method manually, or is there a simple way to achieve this?

The type hints on dataclass attributes are never enforced or checked at runtime. Static type checkers such as mypy are expected to do that job; Python itself does not do it at runtime, for dataclasses or anywhere else.

If you want to add manual type checking code, do so in the __post_init__ method:

@dataclasses.dataclass
class Test:
    value: int

    def __post_init__(self):
        if not isinstance(self.value, int):
            raise ValueError('value not an int')
            # or self.value = int(self.value)

You could use dataclasses.fields(self) to get a tuple of Field objects, each of which specifies a field's name and type, and loop over it to do this for every field automatically instead of writing the check for each one individually.

def __post_init__(self):
    for field in dataclasses.fields(self):
        value = getattr(self, field.name)
        if not isinstance(value, field.type):
            raise ValueError(f'Expected {field.name} to be {field.type}, '
                             f'got {repr(value)}')
            # or setattr(self, field.name, field.type(value))
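The commented-out alternative above can be turned into a complete class. This is only a minimal sketch: the `Point` class and its fields are made up for illustration, and it assumes every `field.type` is a plain class that can be called with one argument, which is not the case for string annotations or typing constructs such as `Optional[int]`.

```python
import dataclasses


@dataclasses.dataclass
class Point:  # hypothetical example class
    x: int
    y: float

    def __post_init__(self):
        for field in dataclasses.fields(self):
            value = getattr(self, field.name)
            # Convert only when the value is not already of the annotated type.
            # Assumes field.type is a real, callable class such as int or float.
            if not isinstance(value, field.type):
                setattr(self, field.name, field.type(value))


p = Point('1', '2.5')
print(p)  # Point(x=1, y=2.5)
```

A value that cannot be converted, such as `Point('apple', 1.0)`, still raises the `ValueError` coming from `int()`.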

It's easy to achieve by using pydantic.validate_arguments.

Just use the validate_arguments decorator on your dataclass:

from dataclasses import dataclass
from pydantic import validate_arguments


@validate_arguments
@dataclass
class Test:
    value: int

Then try your demo; the string '1' will be converted from str to int:

>>> test = Test('1')
>>> type(test.value)
<class 'int'>

If you pass a value that truly cannot be converted, it will raise an exception:

>>> test = Test('apple')
Traceback (most recent call last):
...
pydantic.error_wrappers.ValidationError: 1 validation error for Test
value
  value is not a valid integer (type=type_error.integer)

You could achieve this using the __post_init__ method:

import dataclasses

@dataclasses.dataclass
class Test:
    value: int

    def __post_init__(self):
        self.value = int(self.value)

This method is called right after the generated __init__ method:

https://docs.python.org/3/library/dataclasses.html#post-init-processing

Yeah, the easy answer is to just do the conversion yourself in your own __init__(). I do this because I want my objects to be frozen=True.
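A rough sketch of what that could look like follows. The class name `FrozenTest` is made up, and passing `init=False` is an assumption made here to be explicit that the dataclass should not generate its own `__init__`. Because a frozen dataclass blocks ordinary attribute assignment, the converted value is stored through `object.__setattr__`:

```python
import dataclasses


@dataclasses.dataclass(frozen=True, init=False)
class FrozenTest:  # hypothetical example class
    value: int

    def __init__(self, value):
        # Normal assignment (self.value = ...) would raise FrozenInstanceError,
        # so bypass the frozen __setattr__ to store the converted value.
        object.__setattr__(self, 'value', int(value))


test = FrozenTest('1')
print(type(test.value))  # <class 'int'>

try:
    test.value = 2
except dataclasses.FrozenInstanceError:
    print('instance is frozen')
```

The same `object.__setattr__` trick should also work inside `__post_init__` if you would rather keep the generated `__init__`.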

For the type validation, pydantic claims to do it, but I haven't tried it yet: https://pydantic-docs.helpmanual.io/

With Python dataclasses, the alternative is to use the __post_init__ method, as pointed out in other answers:

import dataclasses

@dataclasses.dataclass
class Test:
    value: int

    def __post_init__(self):
        self.value = int(self.value)

>>> test = Test("42")
>>> type(test.value)
<class 'int'>

Or you can use the attrs package, which allows you to easily set converters:

import attr

@attr.define
class Test:
    value: int = attr.field(converter=int)

>>> test = Test("42")
>>> type(test.value)
<class 'int'>

If your data comes from a mapping instead, you can use the cattrs package, which does conversion based on the type annotations in attrs classes and dataclasses:

import dataclasses
import cattrs

@dataclasses.dataclass
class Test:
    value: int

>>> test = cattrs.structure({"value": "42"}, Test)
>>> type(test.value)
<class 'int'>

You could use a descriptor-typed field:

from dataclasses import dataclass


class IntConversionDescriptor:

    def __set_name__(self, owner, name):
        self._name = "_" + name

    def __get__(self, instance, owner):
        return getattr(instance, self._name)

    def __set__(self, instance, value):
        setattr(instance, self._name, int(value))


@dataclass
class Test:
    value: IntConversionDescriptor = IntConversionDescriptor()

>>> test = Test(value=1)
>>> type(test.value)
<class 'int'>

>>> test = Test(value="12")
>>> type(test.value)
<class 'int'>

>>> test.value = "145"
>>> type(test.value)
<class 'int'>

>>> test.value = 45.12
>>> type(test.value)
<class 'int'>

You could use a generic type-conversion descriptor, declared in descriptors.py:

import sys


class TypeConv:

    __slots__ = (
        '_name',
        '_default_factory',
    )

    def __init__(self, default_factory=None):
        self._default_factory = default_factory

    def __set_name__(self, owner, name):
        self._name = "_" + name
        if self._default_factory is None:
            # determine default factory from the type annotation
            tp = owner.__annotations__[name]
            if isinstance(tp, str):
                # evaluate the forward reference
                base_globals = getattr(sys.modules.get(owner.__module__, None), '__dict__', {})
                idx_pipe = tp.find('|')
                if idx_pipe != -1:
                    tp = tp[:idx_pipe].rstrip()
                tp = eval(tp, base_globals)
            # use `__args__` to handle `Union` types
            self._default_factory = getattr(tp, '__args__', [tp])[0]

    def __get__(self, instance, owner):
        return getattr(instance, self._name)

    def __set__(self, instance, value):
        setattr(instance, self._name, self._default_factory(value))

Usage in main.py would look like this:

from __future__ import annotations
from dataclasses import dataclass
from descriptors import TypeConv


@dataclass
class Test:
    value: int | str = TypeConv()


test = Test(value=1)
print(test)

test = Test(value='12')
print(test)

# watch out: the following assignment raises a `ValueError`
try:
    test.value = '3.21'
except ValueError as e:
    print(e)

Output:

Test(value=1)
Test(value=12)
invalid literal for int() with base 10: '3.21'

Note that while this does work for other simple types, it does not handle conversions for certain types, such as bool or datetime, as normally expected.
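To illustrate why calling the type itself is not enough for these two, here is a small sketch. The `to_bool` helper and the set of spellings it accepts are made up for this example, not part of any library:

```python
from datetime import datetime

# Any non-empty string is truthy, so bool() cannot parse 'False'.
print(bool('False'))  # True

# The datetime constructor expects integers (year, month, day), not a string.
try:
    datetime('2024-01-01')
except TypeError as e:
    print('datetime() does not parse strings:', e)


def to_bool(value):
    """Hypothetical converter: map common string spellings to a bool."""
    if isinstance(value, str):
        return value.strip().lower() in ('1', 'true', 'yes', 'y', 'on')
    return bool(value)


print(to_bool('False'))                      # False
print(datetime.fromisoformat('2024-01-01'))  # 2024-01-01 00:00:00
```

A converter like this could then be passed explicitly as the `default_factory` argument of the `TypeConv` descriptor above instead of relying on the annotation.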

If you are OK with using third-party libraries for this, I have come up with a (de)serialization library called dataclass-wizard that can perform type conversion as needed, but only when from_dict() is called:

from __future__ import annotations
from dataclasses import dataclass

from dataclass_wizard import JSONWizard


@dataclass
class Test(JSONWizard):
    value: int
    is_active: bool


test = Test.from_dict({'value': '123', 'is_active': 'no'})
print(repr(test))

assert test.value == 123
assert not test.is_active

test = Test.from_dict({'is_active': 'tRuE', 'value': '3.21'})
print(repr(test))

assert test.value == 3
assert test.is_active

Why not use setattr?

from dataclasses import dataclass, fields

@dataclass()
class Test:
    value: int

    def __post_init__(self):
        for field in fields(self):
            setattr(self, field.name, field.type(getattr(self, field.name)))

Which yields the required result:

>>> test = Test('1')
>>> type(test.value)
<class 'int'>
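One caveat with this loop: it relies on `field.type` being the actual class. If the annotations are strings, either written in quotes or because the module uses `from __future__ import annotations`, `field.type` is the string `'int'` and calling it fails. A possible workaround, sketched here under the assumption that all annotations resolve to plain callable types, is to resolve them with `typing.get_type_hints` first. The `ratio` field is added only for illustration:

```python
import typing
from dataclasses import dataclass, fields


@dataclass
class Test:
    # Quoted annotations behave like `from __future__ import annotations`:
    # dataclasses stores them as the strings 'int' and 'float'.
    value: 'int'
    ratio: 'float'

    def __post_init__(self):
        # Resolve the string annotations to the real classes.
        hints = typing.get_type_hints(type(self))
        for field in fields(self):
            target = hints[field.name]
            setattr(self, field.name, target(getattr(self, field.name)))


test = Test('1', '0.5')
print(test)  # Test(value=1, ratio=0.5)
```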
