How to create a new column based on multiple conditions in another column

Question

In pandas, how can I create a new column B based on column A in df, such that:

B(i) = 1 if A_(i-1) - A_(i) >= 0 when A_(i) <= 10
B(i) = 1 if A_(i-1) - A_(i) >= 2 when 10 < A_(i) <= 20
B(i) = 1 if A_(i-1) - A_(i) >= 5 when 20 < A_(i)
B(i) = 0 for any other case

However, the first B_i value is always 2.

Example:

A	B
5	2 (the first B_i)
12	0
14	0
22	0
20	1
33	0
11	1
8	1
15	0
11	1

Answer 1

You can use Series.shift from pandas to build A_(i-1) and numpy.select to check the multiple conditions, as shown below:

import pandas as pd
import numpy as np

df = pd.DataFrame({'A':[5,12,14,22,20,33,11,8,15,11]})
df['A_prv'] = df['A'].shift(1)

conditions = [
    (df.index == 0),                                     # first row: B is always 2
    ((df['A_prv'] - df['A'] >= 0) & (df['A'].le(10))),   # A <= 10
    ((df['A_prv'] - df['A'] >= 2) & (df['A'].between(10, 20, inclusive='right'))),  # 10 < A <= 20
    ((df['A_prv'] - df['A'] >= 5) & (df['A'].gt(20)))    # 20 < A
]
# np.select uses the choice of the first condition that is True for each row
choices = [2, 1, 1, 1]
df['B'] = np.select(conditions, choices, default=0)
print(df)

Output:

    A  A_prv  B
0   5    NaN  2
1  12    5.0  0
2  14   12.0  0
3  22   14.0  0
4  20   22.0  1
5  33   20.0  0
6  11   33.0  1
7   8   11.0  1
8  15    8.0  0
9  11   15.0  1
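
Since the three rules differ only in the minimum drop they require, the same result can also be reached by first computing a per-row threshold and then doing a single comparison. The following is a minimal sketch of that idea, not part of the original answer; the names drop and threshold are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [5, 12, 14, 22, 20, 33, 11, 8, 15, 11]})

# Drop from the previous row to the current one: A_(i-1) - A_(i).
# The first row has no predecessor, so its drop is NaN.
drop = df['A'].shift(1) - df['A']

# Minimum drop required for B = 1, chosen by the band A_(i) falls in.
# np.select returns the choice for the first condition that is True,
# so the second condition effectively means 10 < A <= 20.
threshold = np.select(
    [df['A'] <= 10, df['A'] <= 20],
    [0, 2],
    default=5,
)

# NaN >= threshold evaluates to False, so the first row gets 0 here ...
df['B'] = (drop >= threshold).astype(int)
# ... and is then overwritten with the fixed value 2 from the question.
df.loc[df.index[0], 'B'] = 2

print(df['B'].tolist())
```

This avoids the helper column A_prv and keeps the band boundaries in one place, at the cost of setting the first row in a separate step.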

Answer 2

The most intuitive way is to iterate through the rows, testing all three conditions in a single conditional expression (B(i) is 1 whenever any of the conditions is true).

import pandas as pd

df = pd.DataFrame({'A': [5, 12, 14, 22, 20, 33, 11, 8, 15, 11]})
a = df['A'].tolist()  # plain list, so a[i] is positional whatever the index is
B = [2]               # the first value is always 2
for i in range(1, len(a)):
    diff = a[i-1] - a[i]
    # B(i) is 1 if any of the three banded conditions holds, otherwise 0
    newvalue = 1 if ((diff >= 0 and a[i] <= 10)
                     or (diff >= 2 and 10 < a[i] <= 20)
                     or (diff >= 5 and a[i] > 20)) else 0
    B.append(newvalue)
df['B'] = B
print(df)

Output:

    A  B
0   5  2
1  12  0
2  14  0
3  22  0
4  20  1
5  33  0
6  11  1
7   8  1
8  15  0
9  11  1
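
If the loop approach is preferred, the rule can be pulled out into a small function so each band reads on its own line. This is only a sketch of one way to organize it; the function name flag is hypothetical and does not come from the answer.

```python
import pandas as pd

def flag(prev, cur):
    """Return 1 if the drop from prev to cur meets the threshold for cur's band."""
    if cur <= 10:
        threshold = 0
    elif cur <= 20:
        threshold = 2
    else:
        threshold = 5
    return 1 if prev - cur >= threshold else 0

df = pd.DataFrame({'A': [5, 12, 14, 22, 20, 33, 11, 8, 15, 11]})
values = df['A'].tolist()

# Pair each value with its predecessor; the first row has none and is fixed at 2.
df['B'] = [2] + [flag(prev, cur) for prev, cur in zip(values, values[1:])]

print(df['B'].tolist())
```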
