
Proper use of 'self' in a Python script

I'm finally creating a class to analyse my data in a more streamlined way. It takes a CSV file and outputs some information about the table and its columns.

class Analyses:
    def Types_des_colonnes(self, df):
        tcol = df.columns.to_series().groupby(df.dtypes).groups
        tycol = {k.name: v for k, v in tcol.items()}
        return(self.tycol)

    def Analyse_table(self, table):
        # Returns a dict 'tycol' with the types as keys and the column names as values:
        Types_des_colonnes(table)
        nbr_types_colonnes_diff=len(tycol.keys())


        type_table = table.dtypes
        liste_columns = table.columns
        clef_types= tycol.keys()
        long_table = len(table)
        nbr_cols = len(liste_columns)

        print(table.describe())

        print('Nombre de colonnes: '+ str(nbr_cols))
        print('Nombre de types de colonnes différentes: '+str(nbr_types_colonnes_diff))
        for kk in range(0,nbr_types_colonnes_diff):
            print('Type: ' + tycol.keys()[kk])
            print(tycol.values())
        return(liste_columns)

    def Analyse_colonne(self, col):
        from numpy import where, nan
        from pandas import isnull,core,DataFrame
        # If col is a DataFrame:
        if type(col) == core.frame.DataFrame:
            dict_col = {}
            for co in col.columns:
                dict_col_Loc = Analyse_colonne(col[co]);
                dict_col[co] = dict_col_Loc.values()
            return(dict_col)
        elif type(col) == core.series.Series:    
            type_col = type(col)
            arr_null = where(isnull(col))[0]
            type_data = col.dtype
            col_uniq = col.unique()

            nbr_unique= len(col_uniq)
            taille_col= len(col)
            nbr_ligne_vide= len(arr_null)

            top_entree= col.head()
            bottom_entree= col.tail()
            pct_uniq= (float(nbr_unique)/float(taille_col))*100.0
            pct_ligne_vide= (float(nbr_ligne_vide)/float(taille_col))*100.0
            print('\n')
            print('       #################      '+col.name+'      #################')
            print('Type des données: ' + str(type_data))
            print('Taille de la colonne: ' + str(taille_col))
            if nbr_unique == 1:
                print('Aucune entrée unique')
            else:
                print('Nombre d\'uniques: '+ str(nbr_unique))
                print('Pourcentage d\'uniques: '+str(pct_uniq)+' %')
            if nbr_ligne_vide == 0:
                print('Aucune ligne vide')
            else:
                print('Nombre de lignes vides: '+ str(nbr_ligne_vide))
                print('Pourcentage de lignes vides: '+str(pct_ligne_vide)+' %')

            dict_col = {}
            dict_col[col.name] = arr_null
            return(dict_col)
        else:
            print('Problem')

def main():
    anly = Analyses()
    anly.Analyse_table(df_AIS)

if __name__ == '__main__':
    main()

When I run this script, I get:

NameError: name 'tycol' is not defined

This refers to the second line of:

def Analyse_table():
        # Returns a dict 'tycol' with the types as keys and the column names as values:
        Types_des_colonnes(table)
        nbr_types_colonnes_diff=len(tycol.keys())

I know it has to do with using 'self' properly, but I really don't understand how to do so. Could anybody show me how to solve this very easy problem?

(Every 'self' in this script was added by me only to try to make it work on my own.)

The members of a Python object are distinguished from other variables by appearing on the right-hand side of a dot (as in obj.member).

The first parameter of a method is bound to the object on which the method is called. By convention this parameter is named self; this is not a technical requirement.
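A minimal sketch of that binding, using a hypothetical Greeter class that is not part of the question: calling a method on an instance is equivalent to calling the function on the class and passing the instance explicitly, and the first parameter works under any name.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self):
        # 'self' is the instance the method was called on
        return "Hello, " + self.name

    def shout(this):
        # The first parameter can have any name; 'self' is only a convention
        return this.name.upper() + "!"


g = Greeter("Ada")
print(g.greet())          # method call: g is bound to 'self'
print(Greeter.greet(g))   # same call with the instance passed explicitly
print(g.shout())          # works even though the parameter is named 'this'
```

Naming the parameter anything other than self is legal but will confuse most readers, so the convention is worth following.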

tycol is a normal local variable, entirely unassociated with the Analyses object. self.tycol is a different name.

Notice how you return self.tycol from Types_des_colonnes without ever giving it a value (which should raise an AttributeError. Have you tried running the code as you posted it in the question body?). You then discard the returned value at the call site.
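The difference between the two names can be seen in a small sketch. The Demo class and its method names are hypothetical; the point is only that a local variable disappears when the method returns, while reading an attribute that was never assigned raises AttributeError.

```python
class Demo:
    def local_only(self):
        value = 42          # local variable, gone when the method returns
        return self.value   # the attribute 'value' was never assigned

    def stored(self):
        self.value = 42     # attribute, stays on the instance
        return self.value


d = Demo()
try:
    d.local_only()
    error = None
except AttributeError as exc:
    error = exc
    print("AttributeError:", exc)

print(d.stored())   # assigns the attribute, then returns it
print(d.value)      # still reachable after the method has returned
```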

You should either assign the result of Types_des_colonnes to a name in Analyse_table, or exclusively use the name self.tycol.

def Types_des_colonnes(self, df):
    # we don't care about tcol after this; it ceases to exist when the method ends
    tcol = df.columns.to_series().groupby(df.dtypes).groups
    # but we do care about self.tycol, which stays on the instance
    self.tycol = {k.name: v for k, v in tcol.items()}

def Analyse_table(self, table):
    # Builds the dict self.tycol with the types as keys and the column names as values:
    self.Types_des_colonnes(table)
    nbr_types_colonnes_diff = len(self.tycol.keys())
    # ...
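The other option, assigning the returned value to a local name, can be sketched as follows. To keep the example free of pandas, the table here is a plain dict mapping column names to lists, and types_des_colonnes and analyse_table are simplified stand-ins for the methods in the question, not the original implementation.

```python
class Analyses:
    def types_des_colonnes(self, table):
        # Group column names by the type name of their first value
        tycol = {}
        for name, values in table.items():
            tycol.setdefault(type(values[0]).__name__, []).append(name)
        return tycol            # return the local variable, not self.tycol

    def analyse_table(self, table):
        # Call through self, and keep the result in a local name
        tycol = self.types_des_colonnes(table)
        for type_name, columns in tycol.items():
            print("Type:", type_name, "->", columns)
        return len(tycol)


# Hypothetical sample data standing in for the CSV table
table = {"id": [1, 2], "speed": [0.5, 1.5], "name": ["a", "b"], "course": [10.0, 20.0]}
anly = Analyses()
nbr_types = anly.analyse_table(table)
print("Number of distinct column types:", nbr_types)
```

With this variant nothing is stored on the instance, which is the simpler choice when the grouping is only needed inside one method.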

In the method Types_des_colonnes, you need to do: self.tycol = tycol. Also, you need to call the method "as a method", that is, as self.Types_des_colonnes(table). Take a week to read a book about Python to learn some basics. Programming is easy, but not that easy :)
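To illustrate what calling a method "as a method" means, here is a short sketch with a hypothetical Counter class. From inside another method, a method name is not visible as a bare name; it has to be reached through self.

```python
class Counter:
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1

    def broken_twice(self):
        increment()          # NameError: looked up as a global name
        increment()

    def twice(self):
        self.increment()     # looked up on the instance
        self.increment()


c = Counter()
try:
    c.broken_twice()
    error = None
except NameError as exc:
    error = exc
    print("NameError:", exc)

c.twice()
print(c.count)
```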

A class is a data structure that contains "data and the methods that operate on that data". Note that I did not say 'functions', because a method always has access to the data contained within its instance, and therefore the methods in a class are not 'functions' in the mathematical sense. But that's for another day, perhaps.

So, when do you use self? self represents the actual instance of the class on which you are invoking the method. So if you have a class called Shape and two instances of Shape, a and b, then when you call a.area() the self object inside the area method refers to the instance of Shape named a, whereas when you invoke b.area() the self object refers to the b instance of Shape.

In this way you can write a method that works for any instance of Shape. To make this more concrete, here's an example Shape class:

class Shape():
    def __init__(self, length_in, height_in):
        self.length = length_in
        self.height = height_in

    def area(self):
        return self.length * self.height

Here you can see that the data contained within the Shape class is length and height. Those values are assigned in __init__ (the constructor, i.e. a = Shape(3.0, 4.0)) and are stored as members of self. Afterward they can be accessed by the area method through the self object, for calculations. These members can also be reassigned, and new members can be created. (Generally, though, members are only created in the constructor.)
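As a usage sketch, the Shape class above can be instantiated twice to show that each instance carries its own data. The class is repeated so the snippet runs on its own; the instance names a and b and the label attribute are only for illustration.

```python
class Shape:
    def __init__(self, length_in, height_in):
        self.length = length_in
        self.height = height_in

    def area(self):
        return self.length * self.height


a = Shape(3.0, 4.0)
b = Shape(2.0, 5.0)
print(a.area())   # inside area(), self is a
print(b.area())   # inside area(), self is b

a.length = 10.0   # members can be reassigned after construction
print(a.area())

b.label = "door"  # new members can be created outside __init__, though this is uncommon
print(b.label)
```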

This is all very weird compared to the other simple aspects of Python's design. Yet this is not unique to Python. In C++ there is a this pointer that serves the same purpose, and in JavaScript the way that closures are used to create objects often uses a this variable to perform the same task as Python's self.

I hope this helps a little. I can expand on any other questions you have.

Also, it's generally a good idea to put import statements at the top of the file. There are reasons not to, but none of them are good enough for normal coders to use.


 