How to "attach" functionality to objects in Python, e.g. to a pandas DataFrame?

Question

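Maybe this is more of a theoretical language question than one about pandas per se. I have a set of function extensions that I'd like to "attach" to, e.g., a pandas DataFrame without explicitly calling utility functions and passing the DataFrame as an argument, i.e. to have syntactic sugar. Extending the pandas DataFrame is not an option either, because of the inaccessible types needed to define and chain the DataFrame constructor, e.g. Axes and Dtype.

In Scala, one can define an implicit class to attach functionality to an object that is otherwise unavailable or too complex to initialize; e.g. the String type can't be extended in Java, AFAIR. For example, the following attaches a function to the String type dynamically (https://www.oreilly.com/library/view/scala-cookbook/9781449340292/ch01s11.html):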

scala> implicit class StringImprovements(s: String) {
    def increment = s.map(c => (c + 1).toChar)
}

scala> val result = "HAL".increment   
result: String = IBM

Similarly, I'd like to be able to do:

# somewhere in scope
def lexi_sort(df):
    """Lexicographically sorts the input pandas DataFrame by index and columns""" 
    df.sort_index(axis=0, level=df.index.names, inplace=True)
    df.sort_index(axis=1, level=df.columns.names, inplace=True)
    return df

df = pd.DataFrame(...)
# some magic and then ...
df.lexi_sort()

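One possibility that works is to use the decorator pattern, but I was wondering whether Python offers a less boilerplate-heavy language alternative, as Scala does.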

Answer 1

In pandas, you can do something like this:

import pandas as pd

def lexi_sort(df):
    """Lexicographically sorts the input pandas DataFrame by index and columns"""
    df.sort_index(axis=0, level=df.index.names, inplace=True)
    df.sort_index(axis=1, level=df.columns.names, inplace=True)
    return df

pd.DataFrame.lexi_sort = lexi_sort

df = pd.read_csv('dummy.csv')
df.lexi_sort()
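
Because the assignment above patches the class itself, every DataFrame in the process gains the method, including ones created before the patch. The sketch below illustrates that, next to `DataFrame.pipe`, which gives method-style chaining without touching the class. The non-mutating `lexi_sort` here is a simplified stand-in for the function above (it drops the `level` argument and returns a sorted copy), and the sample data is made up for illustration.

```python
import pandas as pd

def lexi_sort(df):
    """Simplified, non-mutating variant: returns a copy sorted by index and columns."""
    return df.sort_index(axis=0).sort_index(axis=1)

# Created BEFORE the patch; it still picks up the method afterwards,
# because attribute lookup goes through the class.
existing = pd.DataFrame({"b": [1, 2], "a": [3, 4]}, index=["y", "x"])

pd.DataFrame.lexi_sort = lexi_sort        # global monkey patch
patched_result = existing.lexi_sort()

# Alternative without patching: pipe passes the frame as the first argument.
piped_result = existing.pipe(lexi_sort)

# The patch can be undone by deleting the class attribute again.
del pd.DataFrame.lexi_sort
```

The patch is visible to all code that shares the interpreter, so `pipe` may be the safer choice inside a library.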

I suppose for other objects you can define a method inside a class to achieve the same result.

class A():
    def __init__(self, df:pd.DataFrame):
        self.df = df
        self.n = 0

    def lexi_sort(self):
        """Lexicographically sorts the input pandas DataFrame by index and columns"""
        self.df.sort_index(axis=0, level=self.df.index.names, inplace=True)
        self.df.sort_index(axis=1, level=self.df.columns.names, inplace=True)
        return self.df

    def add_one(self):
        self.n += 1

a = A(df)
print(a.n)
a.add_one()
print(a.n)
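
pandas also documents a supported extension hook, `pd.api.extensions.register_dataframe_accessor`, which attaches a namespace to every DataFrame without subclassing or patching methods directly. A rough sketch, assuming a hypothetical accessor name `lexi` and a simplified, non-mutating sort rather than the in-place version above:

```python
import pandas as pd

@pd.api.extensions.register_dataframe_accessor("lexi")  # "lexi" is a made-up name
class LexiAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor is reached from
        self._obj = pandas_obj

    def sort(self):
        """Return a copy sorted by index, then by columns."""
        return self._obj.sort_index(axis=0).sort_index(axis=1)

df = pd.DataFrame({"b": [1, 2], "a": [3, 4]}, index=["y", "x"])
result = df.lexi.sort()
print(result)
```

The call site becomes `df.lexi.sort()` rather than `df.lexi_sort()`; the extra namespace keeps custom methods from colliding with pandas' own attribute names.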

Answer 2

Subclass DataFrame and do nothing other than add your functionality.

import pandas as pd
import random,string

class Foo(pd.DataFrame):
    def lexi_sort(self):
        """Lexicographically sorts the input pandas DataFrame by index and columns""" 
        self.sort_index(axis=0, level=self.index.names, inplace=True)
        self.sort_index(axis=1, level=self.columns.names, inplace=True)

nrows = 10        
columns = ['b','d','a','c']
rows = [random.sample(string.ascii_lowercase,len(columns)) for _ in range(nrows)]
index = random.sample(string.ascii_lowercase,nrows)

df = Foo(rows,index,columns)

>>> df
   b  d  a  c
w  n  g  u  m
x  t  e  q  k
n  u  x  j  s
u  s  t  u  b
f  g  t  e  j
j  w  b  h  j
h  v  o  p  a
a  q  i  l  b
g  p  i  k  u
o  q  x  p  t
>>> df.lexi_sort()
>>> df
   a  b  c  d
a  l  q  b  i
f  e  g  j  t
g  k  p  u  i
h  p  v  a  o
j  h  w  j  b
n  j  u  s  x
o  p  q  t  x
u  u  s  b  t
w  u  n  m  g
x  q  t  k  e
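
One caveat with subclassing: DataFrame methods build their results through the `_constructor` property, so unless it is overridden, results of operations such as `head()` may come back as plain DataFrames and lose `lexi_sort`. The sketch below overrides it; the class name `SortableFrame` and the sample data are hypothetical, and the sort is a simplified, non-mutating variant of the one above.

```python
import pandas as pd

class SortableFrame(pd.DataFrame):
    @property
    def _constructor(self):
        # Tell pandas to build results of operations as SortableFrame
        return SortableFrame

    def lexi_sort(self):
        """Return a copy sorted by index, then by columns."""
        return self.sort_index(axis=0).sort_index(axis=1)

df = SortableFrame({"b": [1, 2], "a": [3, 4]}, index=["y", "x"])
result = df.lexi_sort()

# Derived frames keep the subclass, so the method is still available.
chained = df.head(1).lexi_sort()
```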

2 answers

Answer 1
5 (accepted), 2020-10-02 16:01:51

Answer 2
3, 2020-10-02 16:17:37