Implementing if-else in python dataframe using lambda when there are multiple variables

I am trying to implement if-elif or if-else logic in Python while working on a dataframe. I am struggling when working with more than one column.

Sample data frame:

import pandas as pd
df = pd.DataFrame({"one": [1, 2, 3, 4, 5], "two": [6, 7, 8, 9, 10], "name": ['a', 'b', 'a', 'b', 'c']})

If my if-else logic is based on only one column, I know how to do it:

df['one'] = df["one"].apply(lambda x: x*10 if x<2 else (x**2 if x<4 else x+10))

But I want to modify column 'one' based on the values of column 'two', and I feel it's going to be something like this:

lambda x, y: x*100 if y>8 else (x*1 if y<8 else x**2)

But I am not sure how to specify the second column. I tried it this way, but obviously that's incorrect:

df['one'] = df["one"]["two"].apply(lambda x, y: x*100 if y>8 else (x*1 if y<8 else x**2))

Question 1 - What would be the correct syntax for the above code?

Question 2 - How can I implement the logic below using lambda?

if df['name'].isin(['a','b'])  df['one'] = 100 else df['one'] = df['two']

If I write something like x.isin(['a','b']) it won't work.

Apply across columns

Use pd.DataFrame.apply instead of pd.Series.apply and specify axis=1:

df['one'] = df.apply(lambda row: row['one']*100 if row['two']>8 else \
                     (row['one']*1 if row['two']<8 else row['one']**2), axis=1)
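As a minimal sketch of what happens here, assuming the sample frame from the question: with axis=1 the lambda receives each row as a Series keyed by column name, which is why row['one'] and row['two'] work. The names first_row and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"one": [1, 2, 3, 4, 5],
                   "two": [6, 7, 8, 9, 10],
                   "name": ['a', 'b', 'a', 'b', 'c']})

# Peek at what the lambda receives: one row, as a Series keyed by column name.
first_row = df.iloc[0]
print(list(first_row.index))  # ['one', 'two', 'name']

# Same nested conditional as above, applied row by row.
result = df.apply(
    lambda row: row['one'] * 100 if row['two'] > 8
    else (row['one'] if row['two'] < 8 else row['one'] ** 2),
    axis=1,
)
print(result.tolist())  # [1, 2, 9, 400, 500]
```

The third row has two == 8, so it falls through to the final else and is squared.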

Unreadable? Yes, I agree. Let's try again, but this time rewrite it as a named function.

Using a function

Note that lambda is just an anonymous function. We can define a function explicitly and use it with pd.DataFrame.apply:

def calc(row):
    if row['two'] > 8:
        return row['one'] * 100
    elif row['two'] < 8:
        return row['one']
    else:
        return row['one']**2

df['one'] = df.apply(calc, axis=1)

Readable? Yes. But this isn't vectorised. We're looping through each row one at a time. We might as well have used a list. Pandas isn't just for clever table formatting; you can use it for vectorised calculations using arrays in contiguous memory blocks. So let's try one more time.
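To illustrate the "might as well have used a list" remark, here is a rough sketch of the same row-by-row logic written as a plain Python loop over two columns. The helper name calc_values and the variable new_one are made up for this example; the point is only that the work done is a Python-level loop, just like apply with axis=1.

```python
import pandas as pd

df = pd.DataFrame({"one": [1, 2, 3, 4, 5],
                   "two": [6, 7, 8, 9, 10],
                   "name": ['a', 'b', 'a', 'b', 'c']})

def calc_values(one, two):
    # Hypothetical helper: same branches as the row-wise function, on scalars.
    if two > 8:
        return one * 100
    elif two < 8:
        return one
    else:
        return one ** 2

# An explicit Python loop over paired values from the two columns.
new_one = [calc_values(x, y) for x, y in zip(df['one'], df['two'])]
print(new_one)
```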

Vectorised calculations

Using numpy.where:

import numpy as np
df['one'] = np.where(df['two'] > 8, df['one'] * 100,
                     np.where(df['two'] < 8, df['one'],
                              df['one']**2))

There we go. Readable and efficient. We have effectively vectorised our if / else statements. Does this mean that we are doing more calculations than necessary? Yes! Every branch is computed for every row before the result is selected. But this is more than offset by the way in which we are performing the calculations, i.e. with well-defined blocks of memory rather than pointers. On larger frames you will typically find an order of magnitude performance improvement.
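If you want to check the performance claim on your own machine, a rough timing sketch along these lines may help. The frame size, the random data, and the names big, row_wise and vectorised are arbitrary choices for illustration, and the numbers will vary with hardware and library versions.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 20_000  # arbitrary size, large enough to see a difference
big = pd.DataFrame({"one": rng.integers(1, 6, n),
                    "two": rng.integers(6, 11, n)})

def calc(row):
    if row['two'] > 8:
        return row['one'] * 100
    elif row['two'] < 8:
        return row['one']
    else:
        return row['one'] ** 2

def row_wise():
    # Python-level loop over rows.
    return big.apply(calc, axis=1)

def vectorised():
    # All three branches are computed for the whole column, then selected.
    return np.where(big['two'] > 8, big['one'] * 100,
                    np.where(big['two'] < 8, big['one'], big['one'] ** 2))

# Both approaches should produce identical values.
same = np.array_equal(row_wise().to_numpy(), vectorised())

t_apply = timeit.timeit(row_wise, number=3)
t_where = timeit.timeit(vectorised, number=3)
print(f"identical: {same}, apply: {t_apply:.3f}s, np.where: {t_where:.3f}s")
```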

Another example

Well, we can just use numpy.where again.

df['one'] = np.where(df['name'].isin(['a', 'b']), 100, df['two'])
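Since Question 2 asked specifically for a lambda, here is a sketch of that version next to the numpy.where one, assuming the sample frame from the question. Inside a row-wise lambda each row['name'] is a single string, not a Series, so x.isin(...) is not available; Python's in operator does the membership test instead. The names via_lambda and via_where are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"one": [1, 2, 3, 4, 5],
                   "two": [6, 7, 8, 9, 10],
                   "name": ['a', 'b', 'a', 'b', 'c']})

# Row-wise: row['name'] is a plain string, so use `in` rather than .isin().
via_lambda = df.apply(
    lambda row: 100 if row['name'] in ['a', 'b'] else row['two'],
    axis=1,
)

# Vectorised: .isin() works here because df['name'] is a whole Series.
via_where = np.where(df['name'].isin(['a', 'b']), 100, df['two'])

print(via_lambda.tolist())  # [100, 100, 100, 100, 10]
print(via_where.tolist())   # [100, 100, 100, 100, 10]
```

Both give the same values; the numpy.where form avoids the Python-level loop.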

You can do

df.apply(lambda x: x["one"] + x["two"], axis=1)

but I don't think that such a long lambda as lambda x: x["one"]*100 if x["two"]>8 else (x["one"]*1 if x["two"]<8 else x["one"]**2) is very pythonic. apply takes any callback:

def my_callback(x):
    if x["two"] > 8:
        return x["one"]*100
    elif x["two"] < 8:
        return x["one"]
    else:
        return x["one"]**2

df.apply(my_callback, axis=1)
