
Pandas conditional creation of a new dataframe column

This question is an extension of "Pandas conditional creation of a series/dataframe column". If we had this dataframe:

    Col1       Col2
1    A          Z
2    B          Z           
3    B          X
4    C          Y
5    C          W

and we wanted to do the equivalent of:

if Col2 in ('Z','X') then Col3 = 'J' 
else if Col2 = 'Y' then Col3 = 'K'
else Col3 = {value of Col1}

How could I do that?

You can use loc with isin, and finally fillna:

df.loc[df.Col2.isin(['Z','X']), 'Col3'] = 'J'
df.loc[df.Col2 == 'Y', 'Col3'] = 'K'
df['Col3'] = df.Col3.fillna(df.Col1)
print (df)
  Col1 Col2 Col3
1    A    Z    J
2    B    Z    J
3    B    X    J
4    C    Y    K
5    C    W    C
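
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption reconstructed from the table in the question (including the 1..5 index). It also shows why the final fillna is needed: the two loc assignments create Col3 and leave NaN in every row that matched neither condition.

```python
import pandas as pd

# Sample frame rebuilt from the question's table (construction is assumed).
df = pd.DataFrame(
    {"Col1": ["A", "B", "B", "C", "C"], "Col2": ["Z", "Z", "X", "Y", "W"]},
    index=[1, 2, 3, 4, 5],
)

# Assigning through loc creates Col3; rows not selected stay missing.
df.loc[df["Col2"].isin(["Z", "X"]), "Col3"] = "J"
df.loc[df["Col2"] == "Y", "Col3"] = "K"

# Count the rows that matched neither condition before filling them.
missing_before_fill = int(df["Col3"].isna().sum())

# Fall back to the value of Col1 for the remaining rows.
df["Col3"] = df["Col3"].fillna(df["Col1"])
print(df)
```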

Try np.where, which has the form outcome = np.where(condition, value_if_true, value_if_false):

  df["Col3"] = np.where(df['Col2'].isin(['Z','X']), "J", np.where(df['Col2'].isin(['Y']), 'K', df['Col1']))

  Col1 Col2 Col3
1    A    Z    J
2    B    Z    J
3    B    X    J
4    C    Y    K
5    C    W    C
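
If the nesting gets hard to read as more branches are added, np.select is one possible alternative (this is a different function from the np.where used above, not part of this answer). A minimal sketch, assuming the same sample frame as in the question; the names conditions and choices are just illustrative:

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question's table (construction is assumed).
df = pd.DataFrame(
    {"Col1": ["A", "B", "B", "C", "C"], "Col2": ["Z", "Z", "X", "Y", "W"]},
    index=[1, 2, 3, 4, 5],
)

# Conditions are checked in order; the first one that is True wins.
conditions = [df["Col2"].isin(["Z", "X"]), df["Col2"] == "Y"]
choices = ["J", "K"]

# default supplies the "else" branch: here, the value of Col1 for that row.
df["Col3"] = np.select(conditions, choices, default=df["Col1"])
print(df)
```

Each condition lines up with the choice at the same position, so adding another branch means appending one entry to each list.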

A simple (but likely inefficient) approach can be useful when you have multiple if conditions, for example when you are trying to put values into (say) four buckets based on quartiles.

df holds your data, col1 has the values, col2 should hold the bucketized values (1, 2, 3, 4), and quart has the 25%, 50% and 75% bounds. Try this:

  1. Create a dummy list: dummy = []
  2. Iterate through the data frame with: for index, row in df.iterrows():
  3. Set up the if conditions, like: if row[col1] <= quart[0]:  # 25%
  4. Append the proper value to dummy under the if: dummy.append(1)
  5. The nested if-elif can take care of all the remaining values you need to append to dummy.
  6. Add dummy as a column: df[col2] = dummy

You can find the quartiles via A = df.describe() and then print(A[col1]).
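
Put together, the steps above might look like the sketch below. The sample values and the literal column names "col1" and "col2" are assumptions for illustration; quart is filled from the "25%", "50%" and "75%" rows that describe() returns.

```python
import pandas as pd

# Hypothetical sample data; only the column roles come from the answer.
df = pd.DataFrame({"col1": [1, 2, 3, 4, 5, 6, 7, 8]})

# describe() labels the quartile rows "25%", "50%" and "75%".
stats = df.describe()
quart = [stats.loc[label, "col1"] for label in ("25%", "50%", "75%")]

dummy = []
for index, row in df.iterrows():
    if row["col1"] <= quart[0]:    # at or below the 25% bound
        dummy.append(1)
    elif row["col1"] <= quart[1]:  # at or below the 50% bound
        dummy.append(2)
    elif row["col1"] <= quart[2]:  # at or below the 75% bound
        dummy.append(3)
    else:                          # above the 75% bound
        dummy.append(4)

# The list has one entry per row, in row order, so it can be assigned directly.
df["col2"] = dummy
print(df)
```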

