When using Python dataclasses, where is the right place to process the data used to initialize the dataclass?

Question

In Python, I am using a dataclass named "MyDataClass" to store data returned by an HTTP response. Let's say the response content is JSON like this, and I need only the first two fields:

{
    "name": "Test1",
    "duration": 4321,
    "dont_care": "some_data",
    "dont_need": "some_more_data"
}

Now I have two options:

Option 1

resp: dict = ...  # the response's content, parsed as JSON
my_data_class = MyDataClass(name=resp['name'], duration=resp['duration'])

where I take advantage of the dataclass's automatically generated __init__ method,

or

Option 2

resp: dict = ...  # the response's content, parsed as JSON
my_data_class = MyDataClass(resp)

and leave the processing to the dataclass's __init__ method, like this:

def __init__(self, resp: Response) -> None:
    self.name: str = resp['name']
    self.duration: int = resp['duration']

I prefer the 2nd option, but I would like to know if there is a right way to do this.

Thanks.

Answer 1

You only need the first two fields for now, until you actually end up needing more. IMO it'll be far easier to let the dataclass's __init__() method take care of that. Otherwise you would have to change BOTH the call site (MyDataClass(name=.....)) AND the dataclass's __init__. With the 2nd option there is only one place where you need to intervene.

Don't worry about passing the whole response unless the dont_care/dont_need data is huge and you're taking a performance hit because of it; premature optimization is the root of all evil. So keep it simple and flexible as long as you can!

Answer 2

Let's say that in the future you want to extract more data from the response and store it in the dataclass. With Option 1, you would need to add arguments to the __init__ method and also update every place where you initialize the dataclass. Option 2 is therefore preferable, since it reduces code redundancy and keeps the data-extraction logic in one place.

Answer 3

You should absolutely try to avoid overriding a dataclass's __init__ function. There is quite a bit of magic that you'll just lose by overriding it. Among other things, __post_init__ won't be called unless you rewrite that call yourself, which is not trivial.

The reason dataclass works this way is that it is supposed to be a very simple one-to-one mapping of your business data into a programmatic structure. As a consequence, every bit of additional logic you add that has nothing to do with that core idea takes away from the usefulness of dataclass.
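To illustrate the point about __post_init__, here is a minimal sketch. The class names GeneratedInit and HandWrittenInit and the non-negative check on duration are assumptions made up for this example; they do not come from the question. It shows that @dataclass leaves a hand-written __init__ in place, so the __post_init__ hook never runs unless that __init__ calls it explicitly.

```python
from dataclasses import dataclass


@dataclass
class GeneratedInit:
    name: str
    duration: int

    def __post_init__(self) -> None:
        # Called automatically by the generated __init__.
        if self.duration < 0:
            raise ValueError("duration must be non-negative")


@dataclass
class HandWrittenInit:
    name: str
    duration: int

    # @dataclass does not replace an __init__ defined in the class body.
    def __init__(self, resp: dict) -> None:
        self.name = resp["name"]
        self.duration = resp["duration"]

    def __post_init__(self) -> None:
        # Never called here, because the hand-written __init__ does not call it.
        if self.duration < 0:
            raise ValueError("duration must be non-negative")


bad = {"name": "Test1", "duration": -1}

try:
    GeneratedInit(name=bad["name"], duration=bad["duration"])
    generated_validated = False
except ValueError:
    generated_validated = True  # the generated __init__ ran __post_init__

# No exception: the invalid value slips through unchecked.
silently_invalid = HandWrittenInit(bad)
```

The second class could call self.__post_init__() by hand at the end of its __init__, but that is exactly the kind of bookkeeping the generated method otherwise does for you.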

So I'd suggest sticking with option 1.

If writing out the wanted attributes by hand becomes too much of a nuisance, you can consider writing a classmethod that filters out unwanted attributes for you and lets you just splat the dictionary like this:

dataclass_instance = MyDataClass.from_request(**resp)

Here is a post that explains how to do just that; the accompanying question also touches on some of your issues.
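One possible shape for such a classmethod is sketched below. It is not taken from the linked post; the filtering approach via dataclasses.fields is an assumption about how from_request might be written, and the field types are inferred from the sample JSON in the question.

```python
from dataclasses import dataclass, fields


@dataclass
class MyDataClass:
    name: str
    duration: int

    @classmethod
    def from_request(cls, **resp) -> "MyDataClass":
        # Keep only the keys that correspond to declared dataclass fields.
        wanted = {f.name for f in fields(cls)}
        return cls(**{key: value for key, value in resp.items() if key in wanted})


resp = {
    "name": "Test1",
    "duration": 4321,
    "dont_care": "some_data",
    "dont_need": "some_more_data",
}

dataclass_instance = MyDataClass.from_request(**resp)
```

With this approach the generated __init__ stays untouched, and adding a field to the dataclass is enough for it to be picked up from the response; a missing required key still raises a TypeError from the generated __init__.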
