How to "attach" functionality to objects in Python e.g. to pandas DataFrame?

Maybe this is more of a theoretical language question than a pandas question per se. I have a set of function extensions that I'd like to "attach" to, e.g., a pandas DataFrame without explicitly calling utility functions and passing the DataFrame as an argument, i.e. to get the syntactic sugar. Extending pandas DataFrame is also not an option because of the inaccessible types needed to define and chain the DataFrame constructor, e.g. Axes and Dtype.

In Scala one can define an implicit class to attach functionality to an otherwise unavailable or too-complex-to-initialize object; e.g. the String type can't be extended in Java, AFAIR. For example, the following attaches a function to the String type dynamically (https://www.oreilly.com/library/view/scala-cookbook/9781449340292/ch01s11.html):

scala> implicit class StringImprovements(s: String) {
    def increment = s.map(c => (c + 1).toChar)
}

scala> val result = "HAL".increment   
result: String = IBM

Likewise, I'd like to be able to do:

# somewhere in scope
def lexi_sort(df):
    """Lexicographically sorts the input pandas DataFrame by index and columns""" 
    df.sort_index(axis=0, level=df.index.names, inplace=True)
    df.sort_index(axis=1, level=df.columns.names, inplace=True)
    return df

df = pd.DataFrame(...)
# some magic and then ...
df.lexi_sort()

One valid possibility is to use the Decorator Pattern, but I was wondering whether Python offers a lower-boilerplate language alternative, as Scala does.

In pandas, you can do:

import pandas as pd

def lexi_sort(df):
    """Lexicographically sorts the input pandas DataFrame by index and columns"""
    df.sort_index(axis=0, level=df.index.names, inplace=True)
    df.sort_index(axis=1, level=df.columns.names, inplace=True)
    return df

# Assign the function as a class attribute; every DataFrame can now call it as a method
pd.DataFrame.lexi_sort = lexi_sort

df = pd.read_csv('dummy.csv')
df.lexi_sort()
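If assigning directly onto pd.DataFrame feels too invasive, pandas also documents an accessor registration hook, pd.api.extensions.register_dataframe_accessor, which puts custom methods under a namespace of your choosing. Below is a minimal sketch of that idea. The namespace "lexi", the class name LexiAccessor and the sample data are made up for illustration, and the sort returns a new DataFrame instead of sorting in place, which differs from the function above.

```python
import pandas as pd


# "lexi" is a hypothetical namespace chosen for this example
@pd.api.extensions.register_dataframe_accessor("lexi")
class LexiAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor was reached from
        self._obj = pandas_obj

    def sort(self):
        """Return a copy sorted by index and then by columns."""
        # Assumption: a plain sort_index is enough here; it sorts across
        # all levels when the axis is a MultiIndex
        return self._obj.sort_index(axis=0).sort_index(axis=1)


df = pd.DataFrame({"b": [1, 2], "a": [3, 4]}, index=["y", "x"])
result = df.lexi.sort()
print(result)
```

The call site becomes df.lexi.sort() rather than df.lexi_sort(), so the extension cannot collide with a method name that pandas itself might add later.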

I guess for other objects you can define a method within the class to achieve the same outcome.

class A():
    def __init__(self, df:pd.DataFrame):
        self.df = df
        self.n = 0

    def lexi_sort(self):
        """Lexicographically sorts the input pandas DataFrame by index and columns"""
        self.df.sort_index(axis=0, level=self.df.index.names, inplace=True)
        self.df.sort_index(axis=1, level=self.df.columns.names, inplace=True)
        return self.df

    def add_one(self):
        self.n += 1

a = A(df)
print(a.n)
a.add_one()
print(a.n)
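A wrapper like this hides the DataFrame's own methods unless each one is re-exposed by hand, which is the boilerplate the question wants to avoid. One way to reduce it, sketched below, is to forward unknown attribute lookups to the wrapped object with __getattr__. The class name SortableFrame and the sample data are hypothetical, and this version rebinds a sorted copy instead of sorting in place.

```python
import pandas as pd


class SortableFrame:
    """Hypothetical wrapper: adds lexi_sort, forwards everything else."""

    def __init__(self, df: pd.DataFrame):
        self._df = df

    def lexi_sort(self):
        """Sort the wrapped DataFrame by index and columns, return the wrapper."""
        self._df = self._df.sort_index(axis=0).sort_index(axis=1)
        return self

    def __getattr__(self, name):
        # Only called when normal lookup fails, so wrapper attributes win.
        # Guard against recursion if _df has not been set yet.
        if name == "_df":
            raise AttributeError(name)
        return getattr(self._df, name)


wrapped = SortableFrame(pd.DataFrame({"b": [1, 2], "a": [3, 4]}, index=["y", "x"]))
wrapped.lexi_sort()
print(list(wrapped.columns))  # forwarded to the underlying DataFrame
print(wrapped.shape)          # forwarded as well
```

Note that __getattr__ does not cover implicit special-method lookups, so len(wrapped) or wrapped["a"] would still need explicit __len__ and __getitem__ methods on the wrapper.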

Subclass DataFrame and don't do anything but add your feature.

import random, string
import pandas as pd

class Foo(pd.DataFrame):
    def lexi_sort(self):
        """Lexicographically sorts this DataFrame in place by index and columns"""
        # Use self, not a global df, so the method works on any instance
        self.sort_index(axis=0, level=self.index.names, inplace=True)
        self.sort_index(axis=1, level=self.columns.names, inplace=True)

nrows = 10        
columns = ['b','d','a','c']
rows = [random.sample(string.ascii_lowercase,len(columns)) for _ in range(nrows)]
index = random.sample(string.ascii_lowercase,nrows)

df = Foo(rows,index,columns)

>>> df
   b  d  a  c
w  n  g  u  m
x  t  e  q  k
n  u  x  j  s
u  s  t  u  b
f  g  t  e  j
j  w  b  h  j
h  v  o  p  a
a  q  i  l  b
g  p  i  k  u
o  q  x  p  t
>>> df.lexi_sort()
>>> df
   a  b  c  d
a  l  q  b  i
f  e  g  j  t
g  k  p  u  i
h  p  v  a  o
j  h  w  j  b
n  j  u  s  x
o  p  q  t  x
u  u  s  b  t
w  u  n  m  g
x  q  t  k  e
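One caveat with a bare subclass is that most DataFrame operations return a plain DataFrame, so the added method disappears after the first chained call. The pandas documentation on subclassing describes overriding the _constructor property to keep the subclass type. The sketch below assumes that mechanism, uses made-up sample data, and returns a sorted copy rather than sorting in place.

```python
import pandas as pd


class Foo(pd.DataFrame):
    @property
    def _constructor(self):
        # Ask pandas to build operation results as Foo, not plain DataFrame
        return Foo

    def lexi_sort(self):
        """Return a new Foo sorted by index and then by columns."""
        return self.sort_index(axis=0).sort_index(axis=1)


df = Foo({"b": [1, 2], "a": [3, 4]}, index=["y", "x"])
out = df.lexi_sort().head(1)
print(type(out).__name__)
```

With the override in place, the result of head() should still be a Foo, so lexi_sort remains available further down a method chain.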


