How to "attach" functionality to objects in Python, e.g. to a pandas DataFrame?
Maybe this is more of a theoretical language question than a pandas question per se. I have a set of function extensions that I'd like to "attach" to e.g. a pandas DataFrame without explicitly calling utility functions and passing the DataFrame as an argument, i.e. to have the syntactic sugar. Extending pandas DataFrame is also not an option, because the types needed to define and chain the DataFrame constructor, e.g. Axes and Dtype, are inaccessible.
In Scala one can define an implicit class to attach functionality to an otherwise unavailable or too-complex-to-initialize object; e.g. the String type can't be extended in Java, AFAIR. For example, the following attaches a function to the String type dynamically (https://www.oreilly.com/library/view/scala-cookbook/9781449340292/ch01s11.html):
scala> implicit class StringImprovements(s: String) {
    def increment = s.map(c => (c + 1).toChar)
}
scala> val result = "HAL".increment
result: String = IBM
Likewise, I'd like to be able to do:
# somewhere in scope
def lexi_sort(df):
    """Lexicographically sorts the input pandas DataFrame by index and columns"""
    df.sort_index(axis=0, level=df.index.names, inplace=True)
    df.sort_index(axis=1, level=df.columns.names, inplace=True)
    return df

df = pd.DataFrame(...)
# some magic and then ...
df.lexi_sort()
One valid possibility is to use the Decorator Pattern, but I was wondering whether Python offers a language-level alternative with less boilerplate, as Scala does.
In pandas, you can do:
import pandas as pd

def lexi_sort(df):
    """Lexicographically sorts the input pandas DataFrame by index and columns"""
    df.sort_index(axis=0, level=df.index.names, inplace=True)
    df.sort_index(axis=1, level=df.columns.names, inplace=True)
    return df

# Assign the function as an attribute of the class; every DataFrame then has it as a method
pd.DataFrame.lexi_sort = lexi_sort

df = pd.read_csv('dummy.csv')
df.lexi_sort()
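Assigning to pd.DataFrame works, but it changes the class for every piece of code in the process. A less intrusive variant, sketched below, uses the accessor registration hook from pandas.api.extensions. The namespace "lexi" and the class name LexiAccessor are hypothetical names chosen for illustration, and this version returns a sorted copy instead of sorting in place, relying on sort_index sorting by all levels when no level is given.

```python
import pandas as pd


# "lexi" is a hypothetical namespace name chosen for this sketch
@pd.api.extensions.register_dataframe_accessor("lexi")
class LexiAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor is attached to
        self._obj = pandas_obj

    def sort(self):
        """Return a copy sorted by index and then by columns."""
        return self._obj.sort_index(axis=0).sort_index(axis=1)


df = pd.DataFrame({"b": [1, 2], "a": [3, 4]}, index=["y", "x"])
sorted_df = df.lexi.sort()  # the original df is left untouched
```

The call site becomes df.lexi.sort() rather than df.lexi_sort(), which keeps custom methods in their own namespace. For a one-off call without any registration, df.pipe(lexi_sort) gives similar method-chaining syntax.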
I guess for other objects you can define a method within the class to achieve the same outcome.
class A:
    def __init__(self, df: pd.DataFrame):
        self.df = df
        self.n = 0

    def lexi_sort(self):
        """Lexicographically sorts the wrapped pandas DataFrame by index and columns"""
        self.df.sort_index(axis=0, level=self.df.index.names, inplace=True)
        self.df.sort_index(axis=1, level=self.df.columns.names, inplace=True)
        return self.df  # return the wrapped frame, not a global df

    def add_one(self):
        self.n += 1

a = A(df)
print(a.n)
a.add_one()
print(a.n)
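A wrapper like this only exposes the methods it defines itself. One way to keep the wrapped object's own methods reachable is to forward unknown attribute lookups with __getattr__. The sketch below is an illustration only: the class name Extended is hypothetical, and it wraps a plain string to mirror the Scala example from the question.

```python
class Extended:
    """Wrap any object and add methods to it without subclassing."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # Only called when normal lookup fails, so methods defined
        # on the wrapper take precedence over the wrapped object's.
        return getattr(self._wrapped, name)

    def increment(self):
        """Shift every character of the wrapped string by one code point."""
        return "".join(chr(ord(c) + 1) for c in self._wrapped)


s = Extended("HAL")
result = s.increment()  # added method
lowered = s.lower()     # forwarded to str.lower
```

Note that __getattr__ does not forward special methods invoked implicitly, such as len(s) or s + "x"; those would need to be defined on the wrapper explicitly.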
Subclass DataFrame and don't do anything but add your feature.
import random
import string

import pandas as pd

class Foo(pd.DataFrame):
    def lexi_sort(self):
        """Lexicographically sorts the DataFrame in place by index and columns"""
        # use self, not a global df, so the method works on any instance
        self.sort_index(axis=0, level=self.index.names, inplace=True)
        self.sort_index(axis=1, level=self.columns.names, inplace=True)

nrows = 10
columns = ['b', 'd', 'a', 'c']
rows = [random.sample(string.ascii_lowercase, len(columns)) for _ in range(nrows)]
index = random.sample(string.ascii_lowercase, nrows)
df = Foo(rows, index, columns)
>>> df
   b  d  a  c
w  n  g  u  m
x  t  e  q  k
n  u  x  j  s
u  s  t  u  b
f  g  t  e  j
j  w  b  h  j
h  v  o  p  a
a  q  i  l  b
g  p  i  k  u
o  q  x  p  t
>>> df.lexi_sort()
>>> df
   a  b  c  d
a  l  q  b  i
f  e  g  j  t
g  k  p  u  i
h  p  v  a  o
j  h  w  j  b
n  j  u  s  x
o  p  q  t  x
u  u  s  b  t
w  u  n  m  g
x  q  t  k  e
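One caveat with a bare subclass is that many DataFrame operations return a plain DataFrame, so the added method can disappear after slicing or sorting. The pandas documentation on subclassing describes overriding the _constructor property to keep the subclass type. A minimal sketch, assuming a hypothetical class name SortableFrame and a variant of lexi_sort that returns a new frame rather than sorting in place:

```python
import pandas as pd


class SortableFrame(pd.DataFrame):
    @property
    def _constructor(self):
        # Tell pandas to build results of operations as SortableFrame
        return SortableFrame

    def lexi_sort(self):
        """Return a new frame sorted by index and then by columns."""
        return self.sort_index(axis=0).sort_index(axis=1)


sf = SortableFrame({"b": [1, 2], "a": [3, 4]}, index=["y", "x"])
out = sf.lexi_sort()
subset = sf[["a"]]  # column selection should also keep the subclass
```

Because the result is still a SortableFrame, calls can be chained, for example sf[["a"]].lexi_sort().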