How to apply a row-wise function to a pandas dataframe and a shifted version of itself

I have a pandas dataframe where I would like to apply a simple sign-and-multiply operation to each row and the row two indices back (shifted by 2). For example, given two rows:

row_a = np.array([0.45, -0.78, 0.92])
row_b = np.array([1.2, -0.73, -0.46])
sgn_row_a = np.sign(row_a)
sgn_row_b = np.sign(row_b)
result = sgn_row_a * sgn_row_b
result
>>> array([1., 1., -1.])

What I have tried

import pandas as pd
import numpy as np

np.random.seed(42)
df = pd.DataFrame(np.random.normal(0, 1, (100, 5)), columns=["a", "b", "c", "d", "e"])

def kernel(row_a, row_b):
    """Take the sign of both rows and multiply them"""
    sgn_a = np.sign(row_a)
    sgn_b = np.sign(row_b)
    return sgn_a * sgn_b

def func(data):
    """Apply 'kernel' to the dataframe row-wise, axis=1"""
    out = data.apply(lambda x: kernel(x, x.shift(2)), axis=1)
    return out

When I run the function, I get the output below, which is incorrect. It seems to shift the columns rather than the rows. When I tried a different axis in the shift operation, I just got errors (ValueError: No axis named 1 for object type Series).

out = func(df)
out
>>>
      a   b    c    d    e
0   NaN NaN  1.0 -1.0 -1.0
1   NaN NaN -1.0 -1.0  1.0
2   NaN NaN -1.0  1.0 -1.0
3   NaN NaN -1.0  1.0 -1.0
4   NaN NaN  1.0  1.0 -1.0
..   ..  ..  ...  ...  ...

What I expect is

out = func(df)
out
>>>
      a   b    c    d    e
0    -1.  1.   1.  -1.   1.
1     1. -1.   1.   1.  -1.
2    -1.  1.   1.   1.   1.
3    -1.  1.   1.   1.   1.
4    -1. -1.  -1.   1.  -1.
..   ..  ..  ...  ...  ...

How can I achieve a shifted row-wise operation as I have outlined above?

It seems the simplest way to do this particular operation is

df.apply(np.sign) * df.shift(2).apply(np.sign)
>>>
       a    b    c    d    e
0    NaN  NaN  NaN  NaN  NaN
1    NaN  NaN  NaN  NaN  NaN
2   -1.0  1.0  1.0 -1.0  1.0
3    1.0 -1.0  1.0  1.0 -1.0
4   -1.0  1.0  1.0  1.0  1.0
..   ...  ...  ...  ...  ...

To shift the other way, pass a negative value to the shift, e.g. df.shift(-2).
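
As a minimal sketch of the two shift directions (the small frame `demo` is made up for illustration and is not the question's data): `shift(2)` pairs each row with the row two positions earlier, while `shift(-2)` pairs it with the row two positions later.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame, chosen so the signs are easy to follow by eye.
demo = pd.DataFrame({"a": [1.0, -2.0, 3.0, -4.0], "b": [-1.0, -1.0, 1.0, 1.0]})

# Row i is combined with row i - 2; the first two rows have no partner, so they are NaN.
backward = np.sign(demo) * np.sign(demo.shift(2))

# Row i is combined with row i + 2; the last two rows have no partner, so they are NaN.
forward = np.sign(demo) * np.sign(demo.shift(-2))

print(backward)
print(forward)
```

The NaN rows move from the top of the result to the bottom when the sign of the shift is flipped, which is a quick way to check the direction.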

apply loops over the columns; here it is possible to pass the whole DataFrame to the np.sign function instead:

df = np.sign(df) * np.sign(df.shift(2))
print (df)
      a    b    c    d    e
0   NaN  NaN  NaN  NaN  NaN
1   NaN  NaN  NaN  NaN  NaN
2  -1.0  1.0  1.0 -1.0  1.0
3   1.0 -1.0  1.0  1.0 -1.0
4  -1.0  1.0  1.0  1.0  1.0
..  ...  ...  ...  ...  ...
95  1.0  1.0  1.0 -1.0 -1.0
96  1.0  1.0  1.0  1.0 -1.0
97  1.0 -1.0 -1.0  1.0  1.0
98  1.0 -1.0 -1.0 -1.0 -1.0
99 -1.0  1.0  1.0 -1.0 -1.0

[100 rows x 5 columns]
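
Because NumPy ufuncs such as `np.sign` accept a whole DataFrame, the `kernel` from the question does not need `apply` at all. The sketch below assumes the same seeded data as the question and passes the frame and its row-shifted copy straight to the kernel. The `periods` argument is a hypothetical addition for illustration, not part of the original code.

```python
import numpy as np
import pandas as pd

np.random.seed(42)
df = pd.DataFrame(np.random.normal(0, 1, (100, 5)), columns=["a", "b", "c", "d", "e"])

def kernel(row_a, row_b):
    """Take the sign of both inputs and multiply them (as in the question)."""
    return np.sign(row_a) * np.sign(row_b)

def func(data, periods=2):
    """Apply 'kernel' to the whole frame and its row-shifted copy at once."""
    # `periods` is a hypothetical parameter; positive values look back, negative look ahead.
    return kernel(data, data.shift(periods))

out = func(df)

# Compare against the one-line vectorized expression.
expected = np.sign(df) * np.sign(df.shift(2))
print(out.equals(expected))
```

Keeping the kernel as a separate function makes it possible to swap in another element-wise operation later without touching the shifting logic.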

Then, if you need to remove the first rows containing NaNs:

#df = df.dropna()
df = df.iloc[2:]
print (df)
      a    b    c    d    e
2  -1.0  1.0  1.0 -1.0  1.0
3   1.0 -1.0  1.0  1.0 -1.0
4  -1.0  1.0  1.0  1.0  1.0
5  -1.0  1.0  1.0  1.0  1.0
6  -1.0 -1.0 -1.0  1.0 -1.0
..  ...  ...  ...  ...  ...
95  1.0  1.0  1.0 -1.0 -1.0
96  1.0  1.0  1.0  1.0 -1.0
97  1.0 -1.0 -1.0  1.0  1.0
98  1.0 -1.0 -1.0 -1.0 -1.0
99 -1.0  1.0  1.0 -1.0 -1.0

[98 rows x 5 columns]
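
The two options above are not interchangeable when the data itself contains missing values. A small sketch with a made-up frame (`demo` is hypothetical) suggests the difference: `dropna()` discards every row that contains a NaN, while `iloc[2:]` only discards the two rows introduced by the shift.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a genuine missing value in the middle (row 3).
demo = pd.DataFrame({"a": [1.0, -2.0, 3.0, np.nan, 5.0, -6.0]})
res = np.sign(demo) * np.sign(demo.shift(2))

# Drops only the first two rows, which have no partner row after the shift.
by_position = res.iloc[2:]

# Drops every row with a NaN, including those caused by the missing input value.
by_dropna = res.dropna()

print(by_position)
print(by_dropna)
```

If the input is known to be free of NaNs, as with the random data in the question, both approaches give the same rows.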
