
Can I tie specific functions to Pandas dataframe columns?

Quick question: can I somehow tie a function to a dataframe column in Pandas? That is, if I create a dataframe and read a CSV file into it, can I specify that function y will always be run on the values in column 'x' when the data is loaded? For example, could I pass a dictionary to the dataframe when instantiating it, with column names as keys and functions as values?

pipe + transform

Tying functions to pd.DataFrame objects isn't how Pandas works. A better approach is to define a function that takes your input dataframe and performs the manipulations you require, then reuse that same function for other dataframes.

Since you have an input dictionary mapping column labels to functions, you can pass it to transform, which applies each function to its corresponding column. Then use pipe to run the same function on any number of input dataframes.

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(10).reshape((5, 2)))
df2 = pd.DataFrame(np.arange(10, 20).reshape((5, 2)))

def func1(x):
    return x + 100

def func2(x):
    return -x

def enrich_dataframe(mydf):
    d = {0: func1, 1: func2}
    return mydf.transform(d)

df1 = df1.pipe(enrich_dataframe)
df2 = df2.pipe(enrich_dataframe)

print(df1)

#      0  1
# 0  100 -1
# 1  102 -3
# 2  104 -5
# 3  106 -7
# 4  108 -9

print(df2)

#      0   1
# 0  110 -11
# 1  112 -13
# 2  114 -15
# 3  116 -17
# 4  118 -19
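If the functions should run while the CSV is being read, as the question asks, one option worth considering is the converters parameter of pd.read_csv, which accepts a dictionary mapping column names to functions. The sketch below is only an illustration under assumptions: the column names, the sample data, and the helper functions strip_and_upper and to_cents are hypothetical and not part of the original answer. Note that each converter receives the raw cell value as a string, not a whole column.

```python
import io

import pandas as pd


def strip_and_upper(value):
    # Hypothetical converter: receives one raw cell as a string.
    return value.strip().upper()


def to_cents(value):
    # Hypothetical converter: parse the string and convert to integer cents.
    return round(float(value) * 100)


# Column name -> function, applied cell by cell while the file is parsed.
converters = {"name": strip_and_upper, "price": to_cents}

# In-memory stand-in for a CSV file (assumed sample data).
csv_text = "name,price\n apple ,1.50\nbanana,0.25\n"

df = pd.read_csv(io.StringIO(csv_text), converters=converters)
print(df)
```

Because converters work one value at a time, they can be slower than the vectorized transform approach on large files, so reading first and then calling pipe may still be preferable when performance matters.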
