简体   繁体   中英

pandas multiple conditions based on multiple columns

I am trying to color points of a pandas dataframe depending on TWO conditions. Example:

IF value of col1 > a AND value of col2 - value of col3 < b THEN value of col4 = string
ELSE value of col4 = other string.

I have tried so many different ways now and everything I found online was only depending on one condition.

My example code always raises the Error:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Here's the code. Tried several variations without success.

df = pd.DataFrame()

df['A'] = range(10)
df['B'] = range(11,21,1)
df['C'] = range(20,10,-1)

borderE = 3.
ex = 0.

#print df

df['color'] = np.where(all([df.A < borderE, df.B - df.C < ex]), 'r', 'b')

Btw: I understand, what it says but not how to handle it.

Selection criteria uses Boolean indexing :

df['color'] = np.where(((df.A < borderE) & ((df.B - df.C) < ex)), 'r', 'b')

>>> df
   A   B   C color
0  0  11  20     r
1  1  12  19     r
2  2  13  18     r
3  3  14  17     b
4  4  15  16     b
5  5  16  15     b
6  6  17  14     b
7  7  18  13     b
8  8  19  12     b
9  9  20  11     b

wrap the IF in a function and apply it:

def color(row):
    borderE = 3.
    ex = 0.
    if (row.A > borderE) and( row.B - row.C < ex) :
        return "somestring"
    else:
        return "otherstring"

df.loc[:, 'color'] = df.apply(color, axis = 1)

Yields:

  A   B   C        color
0  0  11  20  otherstring
1  1  12  19  otherstring
2  2  13  18  otherstring
3  3  14  17  otherstring
4  4  15  16   somestring
5  5  16  15  otherstring
6  6  17  14  otherstring
7  7  18  13  otherstring
8  8  19  12  otherstring
9  9  20  11  otherstring

for me @Alexander 's solution didn't work in my specific case, I had to use lists to pass the two conditions, and then transpose the output, My solution also work for this case:

df['color'] = np.where([df.A < borderE] and [df.B - df.C < ex], 'r', 'b').transpose()

Yields:

    A   B   C   color
0   0   11  20  r
1   1   12  19  r
2   2   13  18  r
3   3   14  17  b
4   4   15  16  b
5   5   16  15  b
6   6   17  14  b
7   7   18  13  b
8   8   19  12  b
9   9   20  11  b

The error in the question occurred because OP used all() function instead of the bitwise- & operator to chain multiple comparisons together.

Another way to chain multiple comparisons is to evaluate an expression via eval() method. The following checks if A value is less than borderE AND (B - C) value is less than ex for each row.

df.eval('A < @borderE and B - C < @ex')

Using its outcome as a numpy.where() condition, you can assign values by

df['color'] = np.where(df.eval('A < @borderE and B - C < @ex'), 'r', 'b')

If you installed numexpr ( pip install numexpr ), this method should perform as well as chaining via & . The advantage is that (i) it's much more readable (imo) and (ii) you don't need to worry about brackets () , and / & etc. anymore because the order of precedence inside the string expression is the same as that in Python .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM