Lambda function without passing an argument

Question

I have an example dataframe with columns 'one' and 'two' consisting of some random ints. I was trying to understand some code with a lambda function in more depth and was puzzled that the code seems to magically work without providing an argument to be passed to the lambda function.

Initially I create a new column 'newcol' with the pandas assign() method and pass df into an explicit lambda function, func(df). The function returns the natural log of the df's 'one' column:

df=df.assign(newcol=func(df))

So far so good.

However, what puzzles me is that the code works just as well without passing df.

df=df.assign(newcol2=func)

Even if I don't pass (df) into the lambda function, it correctly performs the operation. How does the interpreter know that df is being passed into the lambda function?

Example code and output below:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randint(1,10,size=16).reshape(8,2),columns=["one","two"])
func=lambda x: np.log(x.one)
df=df.assign(newcol=func(df))
print(df)

#This one works too, but why?
df=df.assign(newcol2=func)
print(df)

Output:
   one  two    newcol   newcol2
0    1    8  0.000000  0.000000
1    6    7  1.791759  1.791759
2    2    6  0.693147  0.693147
3    2    8  0.693147  0.693147
4    4    2  1.386294  1.386294
5    9    3  2.197225  2.197225
6    2    2  0.693147  0.693147
7    4    7  1.386294  1.386294

(Note: I could have written the lambda inline in assign(), but I kept it explicit here for the sake of clarity.)

Answer 1

If you use pd.DataFrame.assign() and pass a callable, assign() calls it with the dataframe itself as the first (and only) argument.

For example, if you change your code to the following:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randint(1,10,size=16).reshape(8,2),columns=["one","two"])
func=lambda c, x: np.log(x.one + c)
df=df.assign(newcol=func(1, df))
print(df)

#This one will no longer work!
df=df.assign(newcol2=func)
print(df)

the last call to assign() will not work: assign() calls func(df) with a single argument, so df is bound to c, the parameter x is missing, and a TypeError is raised.

This is explained in the official documentation. The line df.assign(newcol=func(1, df)) uses the non-callable pathway, while the line df.assign(newcol2=func) uses the callable pathway.
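The two pathways can be sketched roughly as follows. This is a simplified illustration, not the pandas source: assign_sketch is a hypothetical helper written for this answer, and the small fixed dataframe replaces the random one from the question so the result is reproducible.

```python
import numpy as np
import pandas as pd


def assign_sketch(df, **kwargs):
    """Hypothetical, simplified stand-in for DataFrame.assign (assumption, not pandas code)."""
    out = df.copy()
    for name, value in kwargs.items():
        # Callable pathway: call the value with the dataframe as its only argument.
        # Non-callable pathway: use the value as it is.
        out[name] = value(out) if callable(value) else value
    return out


df = pd.DataFrame({"one": [1, 6, 2], "two": [8, 7, 6]})
func = lambda x: np.log(x.one)

explicit = assign_sketch(df, newcol=func(df))  # value is already a Series
implicit = assign_sketch(df, newcol=func)      # value is called as func(out)
from_pandas = df.assign(newcol=func)           # the real method, for comparison
```

Under these assumptions all three results hold the same 'newcol' values, which matches what the question observed.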

Answer 2

It's not compilation; it's simply how the assign source code is written, as mentioned in the pandas assign documentation.

Where the value is a callable, evaluated on df:
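A minimal illustration of a callable value being evaluated on df, using a small made-up dataframe rather than the one from the question. It also shows one possible way (an assumption of this sketch, not something either answer proposes) to keep a two-parameter function usable: bind the extra argument with functools.partial so that assign() still receives a one-argument callable.

```python
from functools import partial

import numpy as np
import pandas as pd

df = pd.DataFrame({"one": [1, 6, 2], "two": [8, 7, 6]})

# A callable value is evaluated on df: assign() calls it as value(df).
inline = df.assign(newcol=lambda x: np.log(x.one))

# A two-parameter function fails on the callable pathway,
# because assign() supplies only the dataframe.
func2 = lambda c, x: np.log(x.one + c)
try:
    df.assign(newcol=func2)
    failed = False
except TypeError:
    failed = True

# Binding c first leaves a one-argument callable that assign() can call.
bound = df.assign(newcol=partial(func2, 1))
```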
