Quick question: can I somehow tie a function to a DataFrame column in Pandas? That is, if I create a DataFrame and then read a CSV file into it, can I specify that for column 'x' it will always run function y on the values in x when the data is loaded into the DataFrame? For example, could I pass a dictionary to the DataFrame when instantiating it, with column names and functions as the key-value pairs?
pipe + transform
Tying functions to pd.DataFrame objects isn't how Pandas works. It is much better to define a function that takes your input dataframe and performs the manipulations you require, then reuse the same function for other dataframes.
Since you have an input dictionary mapping column labels to functions, you can use transform for this purpose. Then use pipe to apply it to an arbitrary number of input dataframes.
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(10).reshape((5, 2)))
df2 = pd.DataFrame(np.arange(10, 20).reshape((5, 2)))

def func1(x):
    return x + 100

def func2(x):
    return -x

def enrich_dataframe(mydf):
    # Map each column label to the function applied to that column.
    d = {0: func1, 1: func2}
    return mydf.transform(d)

# Reuse the same function on any number of dataframes.
df1 = df1.pipe(enrich_dataframe)
df2 = df2.pipe(enrich_dataframe)
print(df1)
#      0  1
# 0  100 -1
# 1  102 -3
# 2  104 -5
# 3  106 -7
# 4  108 -9

print(df2)
#      0   1
# 0  110 -11
# 1  112 -13
# 2  114 -15
# 3  116 -17
# 4  118 -19
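
If the goal is specifically to run a function on a column's values while the CSV is being read, one option worth considering is the converters argument of pd.read_csv, which accepts a dictionary mapping column labels to functions. The sketch below is only an illustration: the CSV text, the column names x and y, and the functions add_100 and negate are made up for the example and are not from the question. Note that each converter receives the raw cell as a string, so it has to do its own type conversion.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = "x,y\n1,10\n2,20\n3,30\n"

def add_100(value):
    # Converters are called once per cell with the raw string value.
    return int(value) + 100

def negate(value):
    return -int(value)

# Dictionary of column label -> function, as described in the question.
column_funcs = {"x": add_100, "y": negate}

df = pd.read_csv(io.StringIO(csv_text), converters=column_funcs)
print(df)
```

Unlike the pipe + transform approach, the converters run element by element during parsing rather than on whole columns, so for large files the vectorised transform route is likely to be faster.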