I want to apply my own multi-argument function to a Pandas data-frame (or a series within) using the data frame entries as the k-th argument in my N-argument function.
It only seems to work if I pass the dataframe as the first argument. I want to be able to pass the dataframe through one of the other arguments.
import pandas as pd

# A simple 3 argument function:
def my_func(a, b, c):
    return (a/b)**c
# Data-Frame:
d1 = {
    'column1': [1.1, 1.2, 1.3],
    'column2': [2.1, 2.2, 2.3],
}
df = pd.DataFrame(d1, index=[1, 2, 3])
# I can apply it fine if I pass the columns as the first argument i.e. "a":
df.apply(my_func, b=7, c=9)
# However, I am unable to pass the columns through arguments "b" or "c":
df.apply(my_func, a = 7, c = 9)
This returns a TypeError: ("my_func() got multiple values for argument 'a'", 'occurred at index column1')
I want to be able to pass the columns of the data frame (or series) through any of the arguments of my own multi-argument function. Is there a simple/intuitive (non-hack-like) way of doing this?
If I understand you correctly, all you need is:
my_func(df['column1'], b=7, c=5)
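A pandas Series can be multiplied by, divided by, or raised to the power of a constant, and the result is a Series of the same size. That means you can call my_func directly and put the Series in whichever argument you like.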
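As a quick check of the point above, here is a minimal sketch that reuses my_func and the frame from the question and passes a column as the second or third argument by calling the function directly. The names as_b and as_c, and the constant 2 used for "b" in the second call, are only for illustration.

```python
import pandas as pd

def my_func(a, b, c):
    return (a / b) ** c

df = pd.DataFrame(
    {'column1': [1.1, 1.2, 1.3], 'column2': [2.1, 2.2, 2.3]},
    index=[1, 2, 3],
)

# Column used as "b": scalar / Series is computed elementwise.
as_b = my_func(7, df['column1'], 5)

# Column used as "c": scalar ** Series is also computed elementwise.
# The value 2 for "b" is an arbitrary illustrative constant.
as_c = my_func(7, 2, df['column1'])
```

Both results are Series that keep the original index, so no apply call is needed when the function body only uses arithmetic that pandas already supports.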
In a more sophisticated scenario, when scalar operations aren't enough, it could also be written as something like:
df.apply(lambda row: my_func(row['column1'], b=7, c=5), axis=1)
Here axis=1 tells pandas to apply the function to each row instead of each column (the default).
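The same wrapper idea also works column-wise, which is closer to the original question. df.apply always hands each column to the function as the first positional argument, which is why passing a=7 as a keyword collides with it. A minimal sketch, assuming the my_func and frame from the question (result_b, result_c and the constant 2 are illustrative names and values):

```python
import pandas as pd

def my_func(a, b, c):
    return (a / b) ** c

df = pd.DataFrame(
    {'column1': [1.1, 1.2, 1.3], 'column2': [2.1, 2.2, 2.3]},
    index=[1, 2, 3],
)

# A thin lambda receives each column and routes it to "b".
result_b = df.apply(lambda col: my_func(a=7, b=col, c=9))

# Same idea, routing each column to "c" instead.
result_c = df.apply(lambda col: my_func(a=7, b=2, c=col))
```

Note that functools.partial(my_func, a=7, c=9) would not help here, because apply would still pass the column positionally into "a".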
To apply the function elementwise, you can use df.applymap instead (deprecated in pandas 2.1 in favour of DataFrame.map):
df.applymap(lambda value: my_func(7, value, c=5))
However, it will be much faster if my_func or its input can be adjusted to work on whole vectors instead:
my_func(np.ones(df.shape) * 7, df, c=5)
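To illustrate that the vectorized call gives the same values as the elementwise one, here is a small sketch under the same assumptions as the question (same my_func and frame). The names elementwise_apply, slow, fast_array and fast_scalar are made up for this example, and the hasattr check is just one way to cope with the applymap/map rename across pandas versions.

```python
import numpy as np
import pandas as pd

def my_func(a, b, c):
    return (a / b) ** c

df = pd.DataFrame(
    {'column1': [1.1, 1.2, 1.3], 'column2': [2.1, 2.2, 2.3]},
    index=[1, 2, 3],
)

# Elementwise version: one Python-level call per cell.
# DataFrame.map exists from pandas 2.1; older versions only have applymap.
elementwise_apply = df.map if hasattr(df, 'map') else df.applymap
slow = elementwise_apply(lambda value: my_func(7, value, c=5))

# Vectorized version from the answer: an array of sevens as "a".
fast_array = my_func(np.ones(df.shape) * 7, df, c=5)

# A plain scalar is broadcast the same way, so the array is optional here.
fast_scalar = my_func(7, df, c=5)
```

All three results hold the same numbers; the vectorized calls simply avoid calling my_func once per cell.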