Lambda function to be used in a dataframe

Question

I have the following vector

I would like to implement a lambda function that, given a vector element i, computes the mean of the (i-3)th, (i-2)th, (i-1)th and ith elements. But I do not know how to access the i-3, i-2 and i-1 elements inside the lambda function.

Answer 1

You can use the rolling() method to access the elements of a pandas Series within a specified window. Then, you can use a lambda function to calculate the mean of the elements in that window. To include the three elements to the left of the current element, use a window size of 4:

In [39]: import pandas as pd

In [40]: S = pd.Series([3, 5, 6, 7, 4, 6, 7, 8])

In [41]: S.rolling(4).apply(lambda x: x.mean())
Out[41]: 
0     NaN
1     NaN
2     NaN
3    5.25
4    5.50
5    5.75
6    6.00
7    6.25
dtype: float64

You'll note that the first three elements have missing values. This is because a window of size 4 can only be formed from the fourth element onwards. However, if you want to calculate with smaller windows for the first elements, you can use the min_periods argument to specify the smallest valid window size:

In [42]: S.rolling(4, min_periods=1).apply(lambda x: x.mean())
Out[42]: 
0    3.000000
1    4.000000
2    4.666667
3    5.250000
4    5.500000
5    5.750000
6    6.000000
7    6.250000
dtype: float64
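To make the window contents explicit, here is a minimal sketch that rebuilds the same result by hand. It assumes the window for element i covers positions max(0, i-3) through i, which is how min_periods=1 behaves for a trailing window of size 4; the name manual is only illustrative.

```python
import pandas as pd

S = pd.Series([3, 5, 6, 7, 4, 6, 7, 8])

# Hand-rolled trailing window: positions max(0, i-3) .. i, inclusive.
manual = pd.Series(
    [S.iloc[max(0, i - 3): i + 1].mean() for i in range(len(S))],
    index=S.index,
)

# The same computation done by pandas.
rolled = S.rolling(4, min_periods=1).mean()

# Raises an AssertionError if the two results differ.
pd.testing.assert_series_equal(manual, rolled)
print(manual)
```

The list comprehension is only there to show which elements end up in each window; for real data the rolling() call is the more convenient choice.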

Having said that, you don't need the lambda in the first place; I included it only because you explicitly asked for lambdas. The rolling() method creates a Rolling object that has a built-in mean() method that you can use, like so:

In [43]: S.rolling(4).mean()
Out[43]: 
0     NaN
1     NaN
2     NaN
3    5.25
4    5.50
5    5.75
6    6.00
7    6.25
dtype: float64
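A lambda inside apply() becomes useful when the per-window calculation has no built-in counterpart. As a hedged illustration, the sketch below computes a weighted mean over each window; the weights are hypothetical and not part of the original question.

```python
import numpy as np
import pandas as pd

S = pd.Series([3, 5, 6, 7, 4, 6, 7, 8])

# Hypothetical weights: the current element counts most, element i-3 least.
weights = np.array([1, 2, 3, 4])

# raw=True passes each window to the lambda as a NumPy array,
# ordered from element i-3 (x[0]) to element i (x[3]).
weighted = S.rolling(4).apply(
    lambda x: np.average(x, weights=weights), raw=True
)
print(weighted)
```

As with the plain mean, the first three entries are NaN because no full window of size 4 exists there.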

Answer 2

If you want to do it on a pandas DataFrame, the easiest way is to use .loc, assuming you know the index position of i. Note that .loc selects by label, so this relies on the default integer index, where labels and positions coincide.

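 import pandas as pd

 df = pd.DataFrame([3, 5, 6, 7, 4, 6, 7, 8])
 # x is the index label of your target value.
 # .loc slices include both endpoints, so x:x-3:-1 walks back over four rows.
 setx = lambda x: df.loc[x:x-3:-1].mean()
 setx(4)  # Without mean() gives values [4, 7, 6, 5]
 # The mean of column 0 is 5.5 (returned as a one-element Series).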

That said, if you want to follow PEP 8, it is better to define a function with def here: PEP 8 advises against assigning a lambda expression to an identifier (see python.org/dev/peps/pep-0008/#id50). Thank you @Schmuddi for the clarification.
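A def-based version along those lines might look like the sketch below. The name trailing_mean and the width parameter are illustrative choices, and the code assumes the default integer index and a label of at least width - 1.

```python
import pandas as pd

df = pd.DataFrame([3, 5, 6, 7, 4, 6, 7, 8])


def trailing_mean(frame, label, width=4):
    """Mean of column 0 over `width` rows ending at index label `label`."""
    # .loc slices by label and includes both endpoints, so this selects
    # the rows labelled label-width+1 .. label from column 0.
    window = frame.loc[label - width + 1:label, 0]
    return window.mean()


print(trailing_mean(df, 4))
```

Selecting column 0 explicitly makes the function return a single number rather than a one-element Series.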

Answer 1: score 3, posted 2017-03-06 09:04:13

Answer 2: score 2, accepted, posted 2017-03-06 09:12:52
