How to decide when to introduce a new type instead of using list or tuple?

I like to do some silly stuff with Python, like solving programming puzzles and writing small scripts. Each time, at a certain point, I face a dilemma: should I create a new class to represent my data, or just go quick and dirty and pack all the values into a list or tuple? Due to extreme laziness and a personal dislike of the self keyword, I usually go with the second option.

I understand that in the long run a user-defined data type is better, because path.min_cost and point.x, point.y are much more expressive than path[2] and point[0], point[1]. But when I just need to return multiple things from a function, it strikes me as too much work.

So my question is: what is a good rule of thumb for choosing when to create a user-defined data type and when to go with a list or tuple? Or maybe there is a neat Pythonic way I'm not aware of?

Thanks.

Are you aware of collections.namedtuple? (Available since Python 2.6.)

import collections
def getLocation(stuff):
    x, y = stuff  # placeholder: assumes stuff is an (x, y) pair
    return collections.namedtuple('Point', 'x, y')(x, y)

or, more efficiently (the Point type is created once instead of on every call),

import collections
Point = collections.namedtuple('Point', 'x, y')
def getLocation(stuff):
    x, y = stuff  # placeholder: assumes stuff is an (x, y) pair
    return Point(x, y)

A namedtuple can be accessed by index (point[0]) and unpacked (x, y = point) the same way as a tuple, so it offers a nearly painless upgrade path.
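As a minimal sketch of that upgrade path (the Point type matches the one above; the get_location helper and its string input are hypothetical, chosen only for illustration), the same return value serves both tuple-style and attribute-style callers:

```python
import collections

# Same Point type as in the answer above.
Point = collections.namedtuple('Point', 'x, y')

def get_location(raw):
    # Hypothetical helper: assumes raw is a string such as "3,4".
    x, y = (int(part) for part in raw.split(','))
    return Point(x, y)

point = get_location("3,4")

# Existing tuple-style callers keep working...
x, y = point
first = point[0]

# ...while new code can use the named fields.
total = point.x + point.y

# A namedtuple is still a tuple, so it compares equal to a plain one.
same_as_tuple = point == (3, 4)
```

Code that already unpacks or indexes the result does not need to change when a plain tuple return value is swapped for a namedtuple.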

This is certainly subjective, but I would try to observe the principle of least surprise.

If the values you return describe the characteristics of an object (like point.x and point.y in your example), then I would use a class.

If they are not part of the same object (let's say return min, max), then they should be a tuple.
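A small sketch of that distinction, with hypothetical names (Point, min_max) used only for illustration: values that describe one object live on a class, while unrelated results come back as a plain tuple.

```python
class Point:
    """x and y are characteristics of the same object, so they share a class."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def min_max(values):
    """The smallest and largest values are two separate results: return a tuple."""
    return min(values), max(values)


origin = Point(0, 0)
lowest, highest = min_max([4, 1, 7])
```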

First, an observation about expressivity. You mentioned being concerned about the relative expressivity of point.x, point.y vs. point[0], point[1], but this is a problem that can be solved in more than one way. In fact, for a simple point structure, I think there's an argument to be made that a class is overkill, especially when you could just do this:

x, y = get_point(foo)

I would say this is just about as expressive as point.x, point.y; it's also likely to be faster (than a vanilla class, anyway, since there are no __dict__ lookups), and it's quite readable, assuming the tuple contains just a few items.

My approach to deciding whether to put something in a class has more to do with the way I'll use the data in the program as a whole: I ask myself, "Is this state?" If I have some data that I know will change a lot, and that needs to be stored in one place and manipulated by a group of purpose-built functions, then I know that data is probably state, and I should at least consider putting it in a class. On the other hand, if I have some data that won't change, or that is ephemeral and should disappear once I'm done with it, it's probably not state, and probably doesn't need to go into a class.
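To make the "is this state?" test concrete, here is a sketch under assumed names (Inventory and midpoint are hypothetical). Data that changes over time and is manipulated by purpose-built methods goes in a class, while a short-lived result is simply returned as a tuple.

```python
class Inventory:
    """Long-lived data that changes often: a reasonable candidate for a class."""

    def __init__(self):
        self._counts = {}

    def add(self, item, quantity=1):
        self._counts[item] = self._counts.get(item, 0) + quantity

    def remove(self, item, quantity=1):
        remaining = self._counts.get(item, 0) - quantity
        if remaining < 0:
            raise ValueError("not enough %s in stock" % item)
        self._counts[item] = remaining

    def count(self, item):
        return self._counts.get(item, 0)


def midpoint(a, b):
    """Ephemeral result: computed, used, and discarded, so a tuple is enough."""
    return (a[0] + b[0]) / 2, (a[1] + b[1]) / 2


stock = Inventory()
stock.add("widget", 5)
stock.remove("widget", 2)

mx, my = midpoint((0, 0), (4, 2))
```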

This is, of course, just a rule of thumb; for example, I can think of cases where you might need some kind of "record" type so that you can manipulate a fairly complex collection of data without juggling 15 different local variables (hence the existence of namedtuple). But often, if you're manipulating just one or two values, you'll be better off writing a function that accepts one or two values and returns one or two values, and for that, a tuple or list is perfectly fine.
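As a rough illustration of the "record" case, the sketch below bundles several related values into one namedtuple instead of passing them around as separate locals. The PathInfo type, its fields, and the cheaper helper are all hypothetical.

```python
import collections

# Hypothetical record type; the field names are illustrative only.
PathInfo = collections.namedtuple('PathInfo', 'start, end, hops, min_cost')

def cheaper(path, saving):
    # namedtuples are immutable; _replace returns an updated copy.
    return path._replace(min_cost=path.min_cost - saving)

path = PathInfo(start='A', end='D', hops=3, min_cost=12)
discounted = cheaper(path, 2)
```

The original record is left untouched, which suits data that is not meant to be long-lived, mutable state.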

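Notice: Technical posts on this site are licensed under CC BY-SA 4.0. If you repost this content, please credit this site's URL or the original source. For any questions, contact: yoyou2525@163.com.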

 