Applying a formula to a Pandas DataFrame

Question

I have a very simple question. I have a dataframe like this:

In [19]: df = DataFrame(randn(10,1),columns=list('A'))

In [20]: df

Out[20]: 
          A  
0  0.958465  
1 -0.769077  
2  0.598059  
3  0.290926 
4 -0.248910 
5 -1.352096 
6  0.009125
7 -0.993082
8 -0.593704
9  0.523332

I would like to create a new column B with the following information:

          A              B
0  0.958465  
1 -0.769077  A1*A1+2*A0*A2
2  0.598059  A2*A2+2*A1*A3
3  0.290926  A3*A3+2*A2*A4
4 -0.248910  A4*A4+2*A3*A5
5 -1.352096  ...
6  0.009125  ...
7 -0.993082  ...
8 -0.593704  ...
9  0.523332  ...

It is a sort of convolution or autocorrelation, but using a different window each time. How can I define such a formula in Pandas?

Second question: how can I make the number of points involved in the formula variable? In the example I only use the previous and the next point for the calculation, but how can I pass a variable that tells pandas how many points to use?

Answer 1

You can write a function like this to allow a variable number of lags.

import pandas as pd

def func(s, lags=1):
    # Sum of s[i-lag] * s[i+lag] for lag = 0..lags (lag 0 is the squared centre term)
    return sum(s.shift(lag) * s.shift(-lag) for lag in range(lags+1))

df = pd.DataFrame({"A": [0.958465, -0.769077, 0.598059, 0.290926, -0.248910, -1.352096, 0.009125, -0.993082, -0.593704, 0.523332]})
df["B"] = func(df["A"], 1) # takes 1 point on either side
df["C"] = func(df["A"], 2) # takes 2 points on either side
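
Note that func adds each product s[i-lag] * s[i+lag] once, whereas the formula in the question doubles the neighbour products. Here is a minimal sketch of a variant that keeps the factor of 2, assuming the intended generalisation is B_i = A_i*A_i + 2 * (A_{i-1}*A_{i+1} + ... + A_{i-lags}*A_{i+lags}). The name func_doubled is made up for illustration.

```python
import pandas as pd

def func_doubled(s, lags=1):
    # Centre term A_i * A_i, counted once.
    total = s * s
    # Each symmetric pair A_{i-lag} * A_{i+lag}, counted twice as in the question.
    for lag in range(1, lags + 1):
        total = total + 2 * s.shift(lag) * s.shift(-lag)
    return total

# Values of column A as printed in the question.
df = pd.DataFrame({"A": [0.958465, -0.769077, 0.598059, 0.290926, -0.248910,
                         -1.352096, 0.009125, -0.993082, -0.593704, 0.523332]})
df["B"] = func_doubled(df["A"], 1)  # one point on either side
df["C"] = func_doubled(df["A"], 2)  # two points on either side
print(df)
```

Rows closer than `lags` to either end come out as NaN, because shift has no neighbour to pull in there.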

Answer 2

df['B'] = df['A']**2 + 2 * df['A'].shift() * df['A'].shift(-1)
df
          A         B
0  0.958465       NaN
1 -0.769077  1.737917
2  0.598059 -0.089814
3  0.290926 -0.213088
4 -0.248910 -0.724764
5 -1.352096  1.823621
6  0.009125  2.685568
7 -0.993082  0.975377
8 -0.593704 -0.686939
9  0.523332       NaN
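
The NaN in the first and last rows appears because shift has no neighbour to pull in at the edges. If treating out-of-range neighbours as zero is acceptable (an assumption, since the question does not say how the edges should behave), the fill_value argument of shift allows a sketch like this:

```python
import pandas as pd

# Values of column A as printed in the question.
df = pd.DataFrame({"A": [0.958465, -0.769077, 0.598059, 0.290926, -0.248910,
                         -1.352096, 0.009125, -0.993082, -0.593704, 0.523332]})
a = df["A"]

# Missing neighbours are replaced by 0, so the edge rows reduce to A_i * A_i.
df["B"] = a ** 2 + 2 * a.shift(1, fill_value=0) * a.shift(-1, fill_value=0)
print(df)
```

The interior rows are unchanged; only rows 0 and 9 differ from the output above.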

Answer 3

I think it can also be done without apply:


import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(10, 1), columns=['A'])

df['B'] = df['A'] ** 2 + 2 * df['A'].shift(periods=1) * df['A'].shift(periods=-1)

print(df)

Output:

          A         B
0  0.383006       NaN
1 -1.240859  0.929281
2 -0.796920 -0.603323
3  0.499011  3.129433
4 -1.807221  2.836233
5 -0.430667  3.381179
6 -0.884149  1.999441
7 -1.413762  5.493783
8 -1.976511  6.947394
9 -1.075428       NaN
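
For comparison, the apply-based route could look like the sketch below, which uses a centred rolling window of 3 and checks it against the shift version of the question's formula. The seed 0 and the helper name centred_formula are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
df = pd.DataFrame(rng.standard_normal((10, 1)), columns=["A"])

def centred_formula(w):
    # w is a NumPy array holding [A_{i-1}, A_i, A_{i+1}]
    return w[1] ** 2 + 2 * w[0] * w[2]

# Rolling window of 3 centred on each row; raw=True passes plain arrays.
df["B_apply"] = df["A"].rolling(window=3, center=True).apply(centred_formula, raw=True)

# Vectorised version with shift.
df["B_shift"] = df["A"] ** 2 + 2 * df["A"].shift(1) * df["A"].shift(-1)

print(df)
print(np.allclose(df["B_apply"], df["B_shift"], equal_nan=True))
```

Both columns should agree, including the NaN at each end. The shift version avoids calling a Python function once per row, which is usually the reason to prefer it on large frames.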


Answer 1: score 1, accepted, 2022-04-25 21:28:12
Answer 2: score 0, 2022-04-25 21:09:48
Answer 3: score 0, 2022-04-25 21:37:59
