Creating a class based on pandas.DataFrame using the pandas.read_csv() function to initialize
My goal is to create an object that behaves the same as a pandas DataFrame, but with a few extra methods of my own on top of it. As far as I understand, one approach would be to extend the class, which I first tried to do as follows:
class CustomDF(pd.DataFrame):
    def __init__(self, filename):
        self = pd.read_csv(filename)
But I get errors when trying to view this object, saying: 'CustomDF' object has no attribute '_data'.
My second iteration was to not inherit from DataFrame, but instead to store a DataFrame in one of the object's attributes and have the methods work on it, like this:
class CustomDF():
    def __init__(self, filename):
        self.df = pd.read_csv(filename)
    def custom_method_1(self, a, b, ...):
        ...
    def custom_method_2(self, a, b, ...):
        ...
This is fine, except that in every custom method I need to access the self.df attribute first to do anything with it, and I would prefer that my custom dataframe were just self.
Is there a way that this can be done? Or is this approach not ideal anyway?
The __init__ method is overridden in your first example, so the DataFrame initializer never runs; assigning to self inside the method only rebinds a local name. Use super to call the parent __init__ and then add your custom code:
class CustomDF(pd.DataFrame):
    def __init__(self, *args, **kw):
        super(CustomDF, self).__init__(*args, **kw)
        # Your code here
    def custom_method_1(self, a, b, ...):
        ...
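To make the subclassing route concrete, here is a minimal sketch that may help. It assumes the pattern described in the pandas documentation on subclassing, where overriding the `_constructor` property lets operations such as filtering return the subclass instead of a plain DataFrame. The `from_csv` classmethod, the `column_ranges` method, and the sample data are hypothetical names invented for illustration; they are not part of the original answer.

```python
import io

import pandas as pd


class CustomDF(pd.DataFrame):
    @property
    def _constructor(self):
        # Tells pandas which class to use when an operation builds a new frame
        return CustomDF

    @classmethod
    def from_csv(cls, filepath_or_buffer, **kwargs):
        # Hypothetical helper: read with pandas, then wrap the result in the subclass
        return cls(pd.read_csv(filepath_or_buffer, **kwargs))

    def column_ranges(self):
        # Hypothetical custom method: max minus min for each numeric column
        numeric = self.select_dtypes("number")
        return numeric.max() - numeric.min()


# Illustrative data; a file path would work the same way
csv_text = "a,b\n1,10\n2,40\n3,20\n"
cdf = CustomDF.from_csv(io.StringIO(csv_text))
ranges = cdf.column_ranges()
subset = cdf[cdf["a"] > 1]  # stays a CustomDF because of _constructor
```

Keeping the CSV reading in a classmethod leaves the inherited `__init__` signature untouched, which matters because pandas calls the constructor internally with its own arguments.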
Is this what you were looking for?
import pandas as pd

class CustomDF:
    def __init__(self, filename):
        # filename must be passed in; it was undefined in the method otherwise
        self.df = pd.read_csv(filename)
    def custom_method_1(self, *args, **kwargs):
        result_1 = do_custom_operations_on(self.df, *args, **kwargs)
        return result_1
    def custom_method_2(self, *args, **kwargs):
        result_2 = do_custom_operations_on(self.df, *args, **kwargs)
        return result_2
    ...
I would probably go with the decorator pattern here. The accepted answer for this post will put you on the right track.
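As a rough illustration of what a decorator-style wrapper could look like (this is only a sketch, not the code from the linked answer), the class below holds a DataFrame and forwards any attribute it does not define itself to that DataFrame via `__getattr__`. The `total` method, the `_df` attribute name, and the sample data are assumptions made up for the example.

```python
import io

import pandas as pd


class CustomDF:
    def __init__(self, filepath_or_buffer, **kwargs):
        self._df = pd.read_csv(filepath_or_buffer, **kwargs)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails
        if name == "_df":
            # Avoid infinite recursion if _df has not been set yet
            raise AttributeError(name)
        return getattr(self._df, name)

    # Special methods are looked up on the type, so they bypass __getattr__
    # and have to be forwarded explicitly.
    def __getitem__(self, key):
        return self._df[key]

    def __len__(self):
        return len(self._df)

    def total(self, column):
        # Hypothetical custom method: sum of one column
        return self._df[column].sum()


# Illustrative data; a file path would work the same way
wrapped = CustomDF(io.StringIO("a,b\n1,10\n2,40\n3,20\n"))
shape = wrapped.shape          # forwarded to the DataFrame
b_total = wrapped.total("b")   # defined on the wrapper
```

One trade-off of this design is that results of forwarded calls are plain pandas objects, so the custom methods are not available on them unless they are wrapped again.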
Your first iteration would be really cool, but it seems to me that you would need to know quite a lot about pandas internals, e.g., that this _data attribute needs to be set in a certain way.
Cheers.
In my project I did something similar and used decorators, like manu suggested. The @property decorator might work for you; it basically turns the method .df() into a property .df. The file is therefore only read when the property is accessed. But this only works on instances of the class.
class CustomDF:
    def __init__(self, filename):
        # Store the path so the property can read it later
        self.filename = filename
    @property
    def df(self):
        return pd.read_csv(self.filename)
    def custom_method_1(self, *args, **kwargs):
        result_1 = do_custom_operations_on(self.df, *args, **kwargs)
        return result_1
    def custom_method_2(self, *args, **kwargs):
        result_2 = do_custom_operations_on(self.df, *args, **kwargs)
        return result_2
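One thing to keep in mind with a plain @property is that its body runs on every access, so the CSV would be read again each time. A possible variation, shown here only as a sketch, uses `functools.cached_property` (available since Python 3.8) to read the file on first access and reuse the result afterwards. The `row_count` method and the temporary example file are assumptions added for illustration.

```python
import functools
import os
import tempfile

import pandas as pd


class CustomDF:
    def __init__(self, filename):
        self.filename = filename

    @functools.cached_property
    def df(self):
        # Runs once on first access; the result is then stored on the instance
        return pd.read_csv(self.filename)

    def row_count(self):
        # Hypothetical custom method built on the cached DataFrame
        return len(self.df)


# Write a small illustrative CSV so the example is self-contained
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.csv")
    with open(path, "w") as handle:
        handle.write("a,b\n1,10\n2,40\n")

    custom = CustomDF(path)
    first = custom.df
    second = custom.df
    same_object = first is second  # both accesses return the cached frame
    rows = custom.row_count()
```

If the file can change on disk and fresh data is needed, the plain @property version is the safer choice.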