Proper use of 'self' in a Python script
I'm finally creating a class to analyse my data in a more streamlined way. It takes a CSV file and outputs some information about the table and its columns.
class Analyses:

    def Types_des_colonnes(self, df):
        tcol = df.columns.to_series().groupby(df.dtypes).groups
        tycol = {k.name: v for k, v in tcol.items()}
        return(self.tycol)

    def Analyse_table(self, table):
        # Returns a dict 'tycol' with the types as keys and the column names as values:
        Types_des_colonnes(table)
        nbr_types_colonnes_diff=len(tycol.keys())
        type_table = table.dtypes
        liste_columns = table.columns
        clef_types= tycol.keys()
        long_table = len(table)
        nbr_cols = len(liste_columns)
        print(table.describe())
        print('Nombre de colonnes: '+ str(nbr_cols))
        print('Nombre de types de colonnes différentes: '+str(nbr_types_colonnes_diff))
        for kk in range(0,nbr_types_colonnes_diff):
            print('Type: ' + tycol.keys()[kk])
            print(tycol.values())
        return(liste_columns)

    def Analyse_colonne(self, col):
        from numpy import where, nan
        from pandas import isnull,core,DataFrame
        # If col is a DataFrame:
        if type(col) == core.frame.DataFrame:
            dict_col = {}
            for co in col.columns:
                dict_col_Loc = Analyse_colonne(col[co]);
                dict_col[co] = dict_col_Loc.values()
            return(dict_col)
        elif type(col) == core.series.Series:
            type_col = type(col)
            arr_null = where(isnull(col))[0]
            type_data = col.dtype
            col_uniq = col.unique()
            nbr_unique= len(col_uniq)
            taille_col= len(col)
            nbr_ligne_vide= len(arr_null)
            top_entree= col.head()
            bottom_entree= col.tail()
            pct_uniq= (float(nbr_unique)/float(taille_col))*100.0
            pct_ligne_vide= (float(nbr_ligne_vide)/float(taille_col))*100.0
            print('\n')
            print(' ################# '+col.name+' #################')
            print('Type des données: ' + str(type_data))
            print('Taille de la colonne: ' + str(taille_col))
            if nbr_unique == 1:
                print('Aucune entrée unique')
            else:
                print('Nombre d\'uniques: '+ str(nbr_unique))
                print('Pourcentage d\'uniques: '+str(pct_uniq)+' %')
            if nbr_ligne_vide == 0:
                print('Aucune ligne vide')
            else:
                print('Nombre de lignes vides: '+ str(nbr_ligne_vide))
                print('Pourcentage de lignes vides: '+str(pct_ligne_vide)+' %')
            dict_col = {}
            dict_col[col.name] = arr_null
            return(dict_col)
        else:
            print('Problem')

def main():
    anly = Analyses()
    anly.Analyse_table(df_AIS)

if __name__ == '__main__':
    main()
When I run this script, I get:
NameError: name 'tycol' is not defined
which refers to the second line of:
def Analyse_table(self, table):
    # Returns a dict 'tycol' with the types as keys and the column names as values:
    Types_des_colonnes(table)
    nbr_types_colonnes_diff=len(tycol.keys())
I know it has to do with using 'self' properly, but I really don't understand how to do so. Could anybody show me how to solve this very easy problem?
(All the 'self' occurrences in this script were added by me, only to try to make it work on my own.)
The members of a Python object are distinguished from other variables by appearing on the right-hand side of a . (as in obj.member).
The first parameter of a method is bound to the object on which the method is called. By convention, this parameter is named self; this is not a technical requirement.
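As a rough illustration of that binding, here is a minimal sketch using a hypothetical Counter class (it is not part of the question's code). The first parameter is deliberately named this instead of self, to show that the name is only a convention:

```python
class Counter:
    """Hypothetical class, used only to show how the first parameter is bound."""

    def __init__(this, start):
        # 'this' plays the role usually given to 'self'; Python does not care about the name
        this.value = start

    def increment(this):
        this.value += 1
        return this.value


c = Counter(10)
via_instance = c.increment()       # Python passes c as the first argument
via_class = Counter.increment(c)   # the same call, with the instance passed explicitly
print(via_instance, via_class)
```

Both calls act on the same object c. In real code you would still name the parameter self, since that is what every reader expects.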
tycol is a normal variable, entirely unassociated with the Analyses object. self.tycol is a different name.
Notice how you return self.tycol from Types_des_colonnes without ever giving it a value, which should raise an AttributeError. (Have you tried running the code exactly as you posted it in the question body?) You then discard the returned value at the call site.
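A small sketch of that failure, assuming a hypothetical Demo class standing in for Analyses (the dictionary contents are made up): the local name tycol is assigned, but the attribute self.tycol never is, so reading it fails.

```python
class Demo:
    """Hypothetical stand-in for the Analyses class in the question."""

    def compute(self):
        tycol = {"int64": ["a", "b"]}  # local variable, gone when the method returns
        return self.tycol              # no attribute named tycol was ever set on self


try:
    Demo().compute()
    outcome = "no error"
    message = ""
except AttributeError as exc:
    outcome = "AttributeError"
    message = str(exc)

print(outcome)
print(message)
```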
You should either assign the result of Types_des_colonnes to a name in Analyse_table, or exclusively use the name self.tycol.
def Types_des_colonnes(self, df):
    tcol = df.columns.to_series().groupby(df.dtypes).groups
    # we don't care about tcol after this, it ceases to exist when the method ends
    self.tycol = {k.name: v for k, v in tcol.items()}
    # but we do care about self.tycol

def Analyse_table(self, table):
    # Returns a dict 'tycol' with the types as keys and the column names as values:
    self.Types_des_colonnes(table)  # call it through self, otherwise the name is not found
    nbr_types_colonnes_diff = len(self.tycol.keys())
    # ...
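The other option, returning the value and binding it to a local name in the caller, might look like the sketch below. To keep it runnable without pandas, it assumes a plain dict of {column name: type name} in place of the DataFrame; that input and its sample column names are an illustration, not the question's real data.

```python
class Analyses:
    def Types_des_colonnes(self, dtypes):
        # dtypes is a plain dict {column name: type name}, a stand-in for df.dtypes
        tycol = {}
        for column, type_name in dtypes.items():
            tycol.setdefault(type_name, []).append(column)
        return tycol  # hand the local value back to the caller

    def Analyse_table(self, dtypes):
        tycol = self.Types_des_colonnes(dtypes)  # bind the result to a local name
        nbr_types_colonnes_diff = len(tycol)
        return nbr_types_colonnes_diff


# Hypothetical column names, chosen only for the example
result = Analyses().Analyse_table({"mmsi": "int64", "lat": "float64", "lon": "float64"})
print(result)
```

Here nothing is stored on the instance at all, which is a reasonable choice when the grouping is only needed inside Analyse_table.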
In method Types_des_colonnes, you need to do: self.tycol=tycol. Also, you need to call the method "as a method", that is, through self. Take a week to read a book about Python to learn some basics. Programming is easy, but not that easy :)
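To show what calling a method "as a method" means in practice, here is a short sketch with a hypothetical Greeter class (none of these names come from the question). A bare name inside a method is not looked up on the class, so it has to be reached through self:

```python
class Greeter:
    """Hypothetical class, not part of the question's code."""

    def subject(self):
        return "world"

    def broken(self):
        # bare name: Python searches local, global and builtin names, not the class
        return "hello " + subject()

    def working(self):
        # attribute lookup on the instance finds the method
        return "hello " + self.subject()


g = Greeter()
try:
    g.broken()
    broken_result = "no error"
except NameError:
    broken_result = "NameError"

working_result = g.working()
print(broken_result, working_result)
```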
A class is a data structure that contains "data and the methods that operate on that data". Note that I did not say 'functions', because a method always has access to the data contained within its class, and therefore the methods in a class are not 'functions' in the mathematical sense. But that's for another day, perhaps.
So, when do you use self? self represents the actual instance of the class on which you are invoking the method. So if you have a class called Shape and two instances of Shape, a and b, then when you call a.area() the self object inside the area method refers to the instance of Shape named a, whereas when you invoke b.area() the self object refers to the b instance of Shape.
In this way you can write a method that works for any instance of Shape. To make this more concrete, here's an example Shape class:
class Shape():
    def __init__(self, length_in, height_in):
        self.length = length_in
        self.height = height_in

    def area(self):
        return self.length * self.height
Here you can see that the data contained within the Shape class is length and height. Those values are assigned in __init__ (the constructor, i.e. when you write a = Shape(3.0, 4.0)) and are stored as members of self. Afterwards they can be accessed by the area method through the self object, for calculations. These members can also be reassigned, and new members can be created. (Generally, though, members are only created in the constructor.)
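A short usage sketch of the Shape class described above, with two instances; the variable names and numeric values are chosen only for illustration:

```python
class Shape:
    def __init__(self, length_in, height_in):
        self.length = length_in
        self.height = height_in

    def area(self):
        return self.length * self.height


a = Shape(3.0, 4.0)
b = Shape(2.0, 5.0)

area_a = a.area()        # inside area(), self is a
area_b = b.area()        # inside area(), self is b

a.length = 6.0           # reassigning a member affects only a
area_a_after = a.area()

b.label = "second"       # a new member created outside the constructor
a_has_label = hasattr(a, "label")

print(area_a, area_b, area_a_after, a_has_label)
```

The last two lines show why members are usually created only in the constructor: b now has a label member that a lacks, which makes instances of the same class inconsistent.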
This is all very weird compared to the other, simpler aspects of Python's design. Yet it is not unique to Python. In C++ there is a this pointer that serves the same purpose, and in JavaScript the way closures are used to create objects often relies on a this variable to perform the same task as Python's self.
I hope this helps a little. I can expand on any other questions you have.
Also, it's generally a good idea to put import statements at the top of the file. There are reasons not to, but none of them are good enough for normal coders to use.