Maybe this is more of a theoretical language question than a pandas question per se. I have a set of extension functions that I'd like to "attach" to, e.g., a pandas DataFrame, so that I get the syntactic sugar of calling them as methods instead of calling utility functions and passing the DataFrame as an argument. Subclassing the pandas DataFrame is also not an option because the types needed to define and chain the DataFrame constructor, e.g. Axes and Dtype, are inaccessible.
In Scala one can define an implicit class to attach functionality to an object that is otherwise unavailable or too complex to initialize; e.g., the String type can't be extended in Java, AFAIR. For example, the following attaches a function to the String type (taken from https://www.oreilly.com/library/view/scala-cookbook/9781449340292/ch01s11.html):
scala> implicit class StringImprovements(s: String) {
         def increment = s.map(c => (c + 1).toChar)
       }
scala> val result = "HAL".increment
result: String = IBM
Likewise, I'd like to be able to do:
# somewhere in scope
def lexi_sort(df):
    """Lexicographically sorts the input pandas DataFrame by index and columns"""
    df.sort_index(axis=0, level=df.index.names, inplace=True)
    df.sort_index(axis=1, level=df.columns.names, inplace=True)
    return df

df = pd.DataFrame(...)
# some magic and then ...
df.lexi_sort()
One valid possibility is to use the Decorator Pattern, but I was wondering whether Python offers a language-level alternative with less boilerplate, like Scala does.
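In pandas, you can assign the function as an attribute of the DataFrame class, which makes it callable as a method on every DataFrame:
import pandas as pd

def lexi_sort(df):
    """Lexicographically sorts the input pandas DataFrame by index and columns"""
    df.sort_index(axis=0, level=df.index.names, inplace=True)
    df.sort_index(axis=1, level=df.columns.names, inplace=True)
    return df

# Attach the function to the class; `df` is bound like `self` when called as a method
pd.DataFrame.lexi_sort = lexi_sort
df = pd.read_csv('dummy.csv')
df.lexi_sort()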
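As an alternative to assigning onto pd.DataFrame directly, pandas also provides an accessor registration API, pd.api.extensions.register_dataframe_accessor, which keeps custom methods under their own namespace. The following is only a minimal sketch: the namespace name "lexi", the class name LexiAccessor and the sample data are made up for illustration, and unlike the function above it returns a sorted copy instead of sorting in place.

```python
import pandas as pd


# "lexi" is a hypothetical namespace name chosen for this sketch.
@pd.api.extensions.register_dataframe_accessor("lexi")
class LexiAccessor:
    def __init__(self, pandas_obj):
        # pandas passes the DataFrame the accessor is reached from.
        self._obj = pandas_obj

    def sort(self):
        """Return a copy sorted lexicographically by index and columns."""
        return self._obj.sort_index(axis=0).sort_index(axis=1)


# Illustrative data only.
df = pd.DataFrame([[1, 2], [3, 4]], index=["y", "x"], columns=["b", "a"])
sorted_df = df.lexi.sort()
```

The call site becomes df.lexi.sort() rather than df.lexi_sort(), which is slightly longer but avoids adding names directly to the DataFrame class.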
I guess for other objects, such as classes you write yourself, you can define the method within the class to achieve the same outcome.
class A():
    def __init__(self, df: pd.DataFrame):
        self.df = df
        self.n = 0

    def lexi_sort(self):
        """Lexicographically sorts the wrapped pandas DataFrame by index and columns"""
        self.df.sort_index(axis=0, level=self.df.index.names, inplace=True)
        self.df.sort_index(axis=1, level=self.df.columns.names, inplace=True)
        return self.df  # return the wrapped frame, not a global `df`

    def add_one(self):
        self.n += 1

a = A(df)
print(a.n)  # 0
a.add_one()
print(a.n)  # 1
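A wrapper like this loses the DataFrame's own methods unless they are forwarded. One way to cut that boilerplate, sketched here under the assumption that plain attribute forwarding is enough for your use case, is to implement __getattr__ so that anything the wrapper does not define is looked up on the wrapped frame. The class name SortableFrame and the sample data are hypothetical.

```python
import pandas as pd


class SortableFrame:
    """Hypothetical wrapper: adds lexi_sort and forwards everything else."""

    def __init__(self, df: pd.DataFrame):
        self._df = df

    def lexi_sort(self):
        """Return a new wrapper around the lexicographically sorted frame."""
        return SortableFrame(self._df.sort_index(axis=0).sort_index(axis=1))

    def __getattr__(self, name):
        # Only called when normal lookup fails, so lexi_sort is found first.
        if name == "_df":
            # Avoid infinite recursion if _df has not been set yet.
            raise AttributeError(name)
        return getattr(self._df, name)


# Illustrative data only.
wrapped = SortableFrame(
    pd.DataFrame([[1, 2], [3, 4]], index=["y", "x"], columns=["b", "a"])
)
result = wrapped.lexi_sort()
shape = result.shape  # forwarded to the underlying DataFrame
```

Note that forwarded methods return plain DataFrames, and special methods such as indexing with [] are not covered by __getattr__, so this is closer to a convenience wrapper than a full stand-in for a DataFrame.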
Subclass DataFrame and don't do anything but add your feature.
import pandas as pd
import random, string

class Foo(pd.DataFrame):
    def lexi_sort(self):
        """Lexicographically sorts this DataFrame in place by index and columns"""
        # Use self rather than a global `df` so the method works on any instance
        self.sort_index(axis=0, level=self.index.names, inplace=True)
        self.sort_index(axis=1, level=self.columns.names, inplace=True)

nrows = 10
columns = ['b', 'd', 'a', 'c']
rows = [random.sample(string.ascii_lowercase, len(columns)) for _ in range(nrows)]
index = random.sample(string.ascii_lowercase, nrows)
df = Foo(rows, index, columns)
>>> df
   b  d  a  c
w  n  g  u  m
x  t  e  q  k
n  u  x  j  s
u  s  t  u  b
f  g  t  e  j
j  w  b  h  j
h  v  o  p  a
a  q  i  l  b
g  p  i  k  u
o  q  x  p  t
>>> df.lexi_sort()
>>> df
   a  b  c  d
a  l  q  b  i
f  e  g  j  t
g  k  p  u  i
h  p  v  a  o
j  h  w  j  b
n  j  u  s  x
o  p  q  t  x
u  u  s  b  t
w  u  n  m  g
x  q  t  k  e
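One caveat with a bare subclass is that many DataFrame operations build their result as a plain DataFrame, so the added method can disappear after slicing or chaining. The pandas documentation on subclassing describes overriding the _constructor property for this. The following is a minimal sketch with made-up sample data, using a variant of lexi_sort that returns a sorted copy so the calls can be chained.

```python
import pandas as pd


class Foo(pd.DataFrame):
    @property
    def _constructor(self):
        # Tells pandas which class to use when an operation builds a new frame.
        return Foo

    def lexi_sort(self):
        """Return a copy sorted lexicographically by index and columns."""
        return self.sort_index(axis=0).sort_index(axis=1)


# Illustrative data only.
df = Foo([[1, 2], [3, 4]], index=["y", "x"], columns=["b", "a"])
result = df.lexi_sort()
chained = result.head(1).lexi_sort()  # still a Foo, so the method is available
```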