Determine whether a column value falls within a conditional range based on another column

Question

I have a dataframe that looks as follows:

    import numpy as np
    import pandas as pd

    data = np.array([[5, 'red', 2, 6, 8, 10],
                     [11, 'red', 3, 9, 6, 15],
                     [8, 'blue', 0, 3, 5, 10],
                     [2, 'blue', 1, 2, 3, 4]])
    df = pd.DataFrame(data, columns=['A', 'B', 'red_lower', 'red_upper', 'blue_lower', 'blue_upper'])

    A     B red_lower red_upper blue_lower blue_upper
0   5   red         2         6          8         10
1  11   red         3         9          6         15
2   8  blue         0         3          5         10
3   2  blue         1         2          3          4

I'd like to create an additional column that tells me whether the value in column A is within the range for the color given in column B. For example, in row 0, since 5 has the designation red, I will check whether 5 is between 2 and 6. It is, so the new column gets a 1.

Desired result:

    A     B red_lower red_upper blue_lower blue_upper in_range
0   5   red         2         6          8         10        1
1  11   red         3         9          6         15        0
2   8  blue         0         3          5         10        1
3   2  blue         1         2          3          4        0

I've tried to write a loop, but I'm getting many Series errors. I really don't want to have to split up the dataframe (by color), but maybe that's the way to go? (In my actual dataframe, there are six different 'colors', not just two.)

Thank you!

EDIT: bonus if the additional column also tells me whether the value is above or below the range! For example, in row 1, 11 is outside the range and is too high. The table should look like this:

    A     B red_lower red_upper blue_lower blue_upper in_range
0   5   red         2         6          8         10   inside
1  11   red         3         9          6         15    above
2   8  blue         0         3          5         10   inside
3   2  blue         1         2          3          4    below

Answer 1

`justify` + `broadcast` + `mask` + `logical_and`

You can use some nifty broadcasting here, together with the `justify` function from another answer. This assumes that each color has a single valid range. It also assumes that all of your numeric columns are in fact numeric.
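The numeric assumption matters if the frame was built from a mixed-type `np.array`, because NumPy then stores every value as a string. A minimal sketch of one way to convert the columns first, assuming every column except `B` should be numeric (the two-row frame here is only for illustration):

```python
import numpy as np
import pandas as pd

# A mixed int/str np.array is stored as strings, so all columns start out as text.
data = np.array([[5, 'red', 2, 6, 8, 10],
                 [11, 'red', 3, 9, 6, 15]])
df = pd.DataFrame(data, columns=['A', 'B', 'red_lower', 'red_upper',
                                 'blue_lower', 'blue_upper'])

# Assumption: everything except the color label B holds numbers.
numeric_cols = df.columns.drop('B')
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric)

print(df.dtypes)
```

Building the frame from a list of lists or a dict instead of `np.array` would likely avoid the conversion altogether.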

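values = df.A.values
colors = df.B.values

# Range columns and the color prefix of each one ('red_lower' -> 'red')
range_frame = df.iloc[:, 2:]
ranges = range_frame.columns.str.split('_').str[0].values

# Broadcast to shape (n_rows, n_range_columns): True where the column belongs to another color
m = colors[:, None] != ranges
masked = range_frame.mask(m)

# Push the surviving (lower, upper) pair to the left and keep those two columns
jf = justify(masked.values, invalid_val=np.nan)[:, :2]
# Strict inequalities: a value equal to a bound counts as outside the range
ir = np.logical_and(jf[:, 0] < values, values < jf[:, 1]).astype(int)

c1 = values <= jf[:, 0]
c2 = values >= jf[:, 1]

irl = np.select([c1, c2], ['below', 'above'], 'inside')

df.assign(in_range=ir, in_range_flag=irl)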

    A     B  red_lower  red_upper  blue_lower  blue_upper  in_range in_range_flag
0   5   red          2          6           8          10         1        inside
1  11   red          3          9           6          15         0         above
2   8  blue          0          3           5          10         1        inside
3   2  blue          1          2           3           4         0         below
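A mask handed to `DataFrame.mask` needs the same shape as the frame: one row per data row and one column per range column. A small sketch of the broadcast comparison, using made-up labels with three rows and four range columns so that the orientation is visible:

```python
import numpy as np

colors = np.array(['red', 'blue', 'blue'])          # one label per data row
ranges = np.array(['red', 'red', 'blue', 'blue'])   # one color prefix per range column

# colors[:, None] has shape (3, 1); comparing it with ranges (4,) broadcasts to (3, 4)
keep = colors[:, None] == ranges

print(keep.shape)
print(keep)
```

Each row of `keep` is True only in the columns that belong to that row's color, which is the complement of the mask needed to blank out the other colors.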

`stack` + `reshape` + `logical_and`

This makes the same assumptions as the first approach.

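# Long format: one row per (A, B, range column); the stacked values land in column 0
u = df.set_index(['A', 'B']).stack().rename_axis(['A', 'B', 'flag']).reset_index()
# Keep only the rows whose range column matches the row's color
frame = u[u.flag.str.split('_').str[0] == u.B]

# Each data row now contributes two consecutive rows: lower, then upper
values = frame[::2].A.values
ranges = frame[0].values.reshape(-1, 2)

ir = np.logical_and(ranges[:, 0] < values, values < ranges[:, 1]).astype(int)

c1 = values <= ranges[:, 0]
c2 = values >= ranges[:, 1]

irl = np.select([c1, c2], ['below', 'above'], 'inside')

df.assign(in_range=ir, in_range_flag=irl)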
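To see what the stacked intermediate looks like, here is a sketch on a two-row frame (the data is a cut-down version of the question's, chosen only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'A': [5, 8], 'B': ['red', 'blue'],
                   'red_lower': [2, 0], 'red_upper': [6, 3],
                   'blue_lower': [8, 5], 'blue_upper': [10, 10]})

# One row per (data row, range column); reset_index names the value column 0
u = (df.set_index(['A', 'B']).stack()
       .rename_axis(['A', 'B', 'flag']).reset_index())

# Keep only the bounds that belong to each row's own color
frame = u[u.flag.str.split('_').str[0] == u.B]
print(frame)
```

After filtering, every original row is left with exactly two entries, lower then upper, which is why the values can be reshaped into two columns.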

Here is the definition of the `justify` function by @Divakar:

def justify(a, invalid_val=0, axis=1, side='left'):
    """
    Justifies a 2D array

    Parameters
    ----------
    a : ndarray
        Input array to be justified
    invalid_val : scalar
        Value treated as empty and used to fill the vacated positions
    axis : int
        Axis along which justification is to be made
    side : str
        Direction of justification. It could be 'left', 'right', 'up', 'down'
        It should be 'left' or 'right' for axis=1 and 'up' or 'down' for axis=0.

    """

    if invalid_val is np.nan:
        mask = ~np.isnan(a)
    else:
        mask = a!=invalid_val
    justified_mask = np.sort(mask,axis=axis)
    if (side=='up') | (side=='left'):
        justified_mask = np.flip(justified_mask,axis=axis)
    out = np.full(a.shape, invalid_val) 
    if axis==1:
        out[justified_mask] = a[mask]
    else:
        out.T[justified_mask.T] = a.T[mask.T]
    return out
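The core of `justify` is sorting a boolean mask so that the valid positions move to one side. A minimal sketch of that step in isolation, for the left-justified, NaN-as-invalid case only (the sample array is made up):

```python
import numpy as np

a = np.array([[np.nan, np.nan, 8.0, 10.0],
              [1.0, 2.0, np.nan, np.nan]])

mask = ~np.isnan(a)                                       # True where a value is valid
# Sorting booleans puts False first; flipping moves the True entries to the left
justified_mask = np.flip(np.sort(mask, axis=1), axis=1)

out = np.full(a.shape, np.nan)
out[justified_mask] = a[mask]                             # fill left slots row by row

print(out)
```

Both masks contain the same number of True entries in each row, so the row-major assignment keeps every value in its own row.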

Answer 2

This uses `groupby` to split the dataframe by color. The matching range columns are picked up from the group name, so you do not need to enter each color by hand.

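parts = []
for name, x in df.groupby('B', sort=False):
    # Columns whose name contains this color: <color>_lower, then <color>_upper
    bounds = x.filter(like=name)
    lower, upper = bounds.iloc[:, 0], bounds.iloc[:, 1]
    s1 = (x.A >= lower) & (x.A <= upper)
    s2 = x.A < lower
    # Keep the group's index so the labels line up with the original rows
    parts.append(pd.Series(np.select([s1, s2], ['inside', 'below'], default='above'), index=x.index))

df['in_range'] = pd.concat(parts)
df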
Out[64]: 
    A     B  red_lower  red_upper  blue_lower  blue_upper in_range
0   5   red          2          6           8          10   inside
1  11   red          3          9           6          15    above
2   8  blue          0          3           5          10   inside
3   2  blue          1          2           3           4    below
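For comparison, a sketch of a per-row lookup that avoids grouping altogether. This is a different technique from the answer above, and it assumes every color in `B` has columns named `<color>_lower` and `<color>_upper` and that the bounds are inclusive:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [5, 11, 8, 2], 'B': ['red', 'red', 'blue', 'blue'],
                   'red_lower': [2, 3, 0, 1], 'red_upper': [6, 9, 3, 2],
                   'blue_lower': [8, 6, 5, 3], 'blue_upper': [10, 15, 10, 4]})

bounds = df.drop(columns=['A', 'B'])
rows = np.arange(len(df))

# Column position of each row's own bounds; get_indexer returns -1 for a missing name
lo_idx = bounds.columns.get_indexer((df['B'] + '_lower').tolist())
hi_idx = bounds.columns.get_indexer((df['B'] + '_upper').tolist())
assert (lo_idx >= 0).all() and (hi_idx >= 0).all(), "missing range column"

lower = bounds.to_numpy()[rows, lo_idx]
upper = bounds.to_numpy()[rows, hi_idx]
a = df['A'].to_numpy()

df['in_range'] = np.select([a < lower, a > upper], ['below', 'above'], default='inside')
print(df)
```

Because the result is computed row by row, it does not depend on how the rows are ordered or on how many colors there are.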
