Alternatives to looping through a function taking inputs from several Pandas series

I have been using Pandas for a while but have not come across a need to do this until now. Here's the setup: I have several Pandas series with exactly identical indices, say A, B and C, and a complicated function func(). What I am trying to do (in a non-Pandas-efficient way) is iterate through the index of the series, applying func() to the corresponding elements.

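import pandas

D = pandas.Series(index=A.index, dtype=float)  # First create an empty Series (assumes func returns numbers)
for i in A.index:  # iterate over index labels, so this also works for non-integer indices
    D[i] = func(A[i], B[i], C[i])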

Is there a Pandas-efficient way of doing the above that takes into account that this is essentially an array-based operation? I looked at pandas.DataFrame.apply but the examples show application of simple functions such as numpy.sqrt() that take only one series argument.

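If func is built from vectorized operations (arithmetic, NumPy functions and the like), you can pass the whole pd.Series objects to it, and it should return a Series as well.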

Therefore,

D = func(A, B, C)

should yield D as a pd.Series which is a vectorized result over the A, B and C values.
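As a minimal sketch of this, assume a hypothetical func that uses only arithmetic and a NumPy ufunc (the formula and the sample data are invented for illustration, not taken from the question). Because every operation inside it works element-wise, calling it on the Series directly aligns them on their shared index and returns a Series:

```python
import numpy as np
import pandas as pd

def func(a, b, c):
    # Hypothetical example function: arithmetic plus a NumPy ufunc,
    # so it accepts scalars and whole Series alike.
    return np.sqrt(a ** 2 + b ** 2) * c

# Sample data (assumed for illustration); all three share the same index.
idx = ["x", "y", "z"]
A = pd.Series([3.0, 6.0, 5.0], index=idx)
B = pd.Series([4.0, 8.0, 12.0], index=idx)
C = pd.Series([1.0, 2.0, 3.0], index=idx)

# One call, no explicit loop; the result keeps the common index.
D = func(A, B, C)
print(D)
```

This only works if everything inside func is itself vectorized; a scalar-only construct such as an `if` on its arguments would raise an error when given a Series.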

If you want a new column on a DataFrame you could solve it this way:

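# Row-wise apply on the DataFrame (Series.apply has no axis argument);
# row.name is the index label of the current row.
df['new column'] = df.apply(
    lambda row: func(row['data column'], A[row.name], B[row.name], C[row.name]),
    axis=1)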
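If func cannot take whole Series (for example, because it branches on its arguments), a row-wise call is still possible without indexing by position. The sketch below assumes a hypothetical scalar-only func and made-up data, and shows two options: combining the Series into a DataFrame and using apply with axis=1, or zipping the Series in a list comprehension. Neither is truly vectorized; both still call func once per row.

```python
import pandas as pd

def func(a, b, c):
    # Hypothetical scalar-only function: the `if` would raise an error
    # if a and b were whole Series.
    if a > b:
        return a - c
    return b + c

# Sample data (assumed for illustration); all three share the same index.
idx = ["x", "y", "z"]
A = pd.Series([1, 5, 3], index=idx)
B = pd.Series([2, 4, 3], index=idx)
C = pd.Series([10, 20, 30], index=idx)

# Option 1: put the Series side by side as columns, then apply row by row.
df = pd.concat({"A": A, "B": B, "C": C}, axis=1)
D_apply = df.apply(lambda row: func(row["A"], row["B"], row["C"]), axis=1)

# Option 2: zip the values and rebuild a Series on the original index.
D_zip = pd.Series([func(a, b, c) for a, b, c in zip(A, B, C)], index=A.index)

print(D_apply)
print(D_zip)
```

The zip version relies on the Series having identical indices in the same order, as stated in the question, because it pairs values by position rather than by label.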
