
How to create a new column based on multiple conditions in another column

In pandas, how can I create a new column B based on column A in df, such that:

  • B(i)=1 if A_(i-1)-A_(i) >= 0 when A_(i) <= 10
  • B(i)=1 if A_(i-1)-A_(i) >= 2 when 10 < A_(i) <= 20
  • B(i)=1 if A_(i-1)-A_(i) >= 5 when 20 < A_(i)
  • B(i)=0 for any other case

However, the first B_i value is always 2.

Example:

A    B
5    2 (the first B_i)
12   0
14   0
22   0
20   1
33   0
11   1
8    1
15   0
11   1

You can use Series.shift to create A_(i-1) and numpy.select to check multiple conditions, as shown below:

import pandas as pd
import numpy as np

df = pd.DataFrame({'A':[5,12,14,22,20,33,11,8,15,11]})
df['A_prv'] = df['A'].shift(1)

conditions = [
    (df.index==0),
    ((df['A_prv'] - df['A'] >= 0) & (df['A'].le(10))),
    ((df['A_prv'] - df['A'] >= 2) & (df['A'].between(10, 20, inclusive='right'))),
                                     # ^^^  10 < df['A'] <= 20 ^^^
    ((df['A_prv'] - df['A'] >= 5) & (df['A'].gt(20)))  # 20 < df['A'], strictly greater as in the question
]
choices = [2, 1, 1, 1]
df['B'] = np.select(conditions, choices, default=0)
print(df)

Output:

    A  A_prv  B
0   5    NaN  2
1  12    5.0  0
2  14   12.0  0
3  22   14.0  0
4  20   22.0  1
5  33   20.0  0
6  11   33.0  1
7   8   11.0  1
8  15    8.0  0
9  11   15.0  1
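
Since all three rules differ only in the required drop, one possible variation is to map each A value to its threshold first and then do a single comparison. This is only a sketch of that idea, not part of the answer above: the function name flag_drops is made up for illustration, and it assumes the first row should always get 2, as stated in the question.

```python
import numpy as np
import pandas as pd

def flag_drops(a: pd.Series) -> pd.Series:
    """Hypothetical helper: 1 where the drop from the previous row meets the threshold for A."""
    drop = a.shift(1) - a  # A_(i-1) - A_(i); NaN for the first row
    # Required drop per row: 0 if A <= 10, 2 if 10 < A <= 20, otherwise 5
    threshold = np.select([a.le(10), a.le(20)], [0, 2], default=5)
    b = (drop >= threshold).astype(int)  # NaN >= x is False, so the first row starts as 0
    if len(b):
        b.iloc[0] = 2  # the first value is always 2, per the question
    return b

df = pd.DataFrame({'A': [5, 12, 14, 22, 20, 33, 11, 8, 15, 11]})
df['B'] = flag_drops(df['A'])
print(df)
```

This avoids the A_prv helper column and keeps the thresholds in one place, at the cost of being a little less explicit than listing every condition.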

The most intuitive way is to iterate through the rows, testing all three conditions in a single if-else expression (B(i) is 1 whenever any one of the conditions is true).

import pandas as pd

df = pd.DataFrame({'A':[5,12,14,22,20,33,11,8,15,11]})
B = [2]
for i in range(1, len(df['A'])):
    prev, cur = df['A'][i-1], df['A'][i]
    drop = prev - cur  # A_(i-1) - A_(i)
    newvalue = 1 if (
        (drop >= 0 and cur <= 10)
        or (drop >= 2 and 10 < cur <= 20)
        or (drop >= 5 and cur > 20)
    ) else 0
    B.append(newvalue)
df['B'] = B
print(df)

Output:

    A  B
0   5  2
1  12  0
2  14  0
3  22  0
4  20  1
5  33  0
6  11  1
7   8  1
8  15  0
9  11  1
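
One caveat with the loop above: df['A'][i] looks values up by index label, so it relies on df having the default 0..n-1 index. A possible way around that, sketched here under the assumption that df is non-empty, is to pull the values into a plain list and pair each one with its predecessor. The helper name rule and the letter index are made up for illustration.

```python
import pandas as pd

def rule(prev, cur):
    """Hypothetical helper: apply the three threshold rules to one pair of values."""
    drop = prev - cur  # A_(i-1) - A_(i)
    if cur <= 10:
        return int(drop >= 0)
    if cur <= 20:
        return int(drop >= 2)
    return int(drop >= 5)

# A non-default index, to show the approach does not depend on labels 0..n-1
df = pd.DataFrame({'A': [5, 12, 14, 22, 20, 33, 11, 8, 15, 11]},
                  index=list('abcdefghij'))

values = df['A'].tolist()
# First value is always 2; the rest compare each value with the one before it
df['B'] = [2] + [rule(p, c) for p, c in zip(values, values[1:])]
print(df)
```

Working on a plain list also avoids repeated Series lookups inside the loop, which tends to matter on larger frames.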

