A better way to create a new column based on the values of other columns

Question

What is a better way to create the column built by the code below?

col_new = []
for r1 in df['col_A']:
    if r1==1:
        for r2 in df['col_B']:
            if r2!='None':
                col_new.append('col_new')

df['col_new'] = col_new

My dataframe is huge (120k * 22), and running the code above hangs the notebook. Is there a faster and more efficient way to create this column, so that it represents all the non-null values of col_B when col_A is 1?

Answer 1

I believe you need to create a boolean mask and then assign the value with DataFrame.loc:

mask = (df['col_A'] == 1) & (df['col_B']!='None')

# if None is a real missing value rather than the string 'None'
#mask = (df['col_A'] == 1) & (df['col_B'].notnull())
df.loc[mask, 'col_new'] = 'col_new'
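
For context, the nested loop in the question walks all of col_B again for every row where col_A is 1, so on 120k rows it can reach roughly 120,000 × 120,000 inner iterations, and the resulting list generally does not have one entry per row. The sketch below assumes the intent was one decision per row and checks that a row-wise loop and the boolean mask flag the same rows. The sample values are made up for illustration.

```python
import pandas as pd

# Small illustrative frame; the values are invented for this sketch.
df = pd.DataFrame({
    'col_A': [1, 1, 2, 1],
    'col_B': ['a', 'None', 'b', 'c'],
})

# Row-wise loop: one decision per row, pairing col_A and col_B by position.
looped = []
for a, b in zip(df['col_A'], df['col_B']):
    looped.append('col_new' if a == 1 and b != 'None' else None)

# Vectorized mask: the same condition evaluated for the whole column at once.
mask = (df['col_A'] == 1) & (df['col_B'] != 'None')
df.loc[mask, 'col_new'] = 'col_new'

# Both approaches flag the same rows.
same_rows = [v is not None for v in looped] == mask.tolist()
print(same_rows)
```

The mask version avoids the Python-level loop entirely, which is where the speedup on a large frame comes from.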

Sample:

If the column contains the string 'None':

import pandas as pd

df = pd.DataFrame({
    'col_A': [1,1,2,1],
    'col_B': ['a','None','None','a']
})
print (df)
   col_A col_B
0      1     a
1      1  None
2      2  None
3      1     a

mask = (df['col_A'] == 1) & (df['col_B']!='None')
df.loc[mask, 'col_new'] = 'val'
print (df)
   col_A col_B col_new
0      1     a     val
1      1  None     NaN
2      2  None     NaN
3      1     a     val

If the column contains real None values rather than strings, use Series.notna:

df = pd.DataFrame({
    'col_A': [1,1,2,1],
    'col_B': ['a',None,None,'a']
})
print (df)
   col_A col_B
0      1     a
1      1  None
2      2  None
3      1     a

mask = (df['col_A'] == 1) & (df['col_B'].notna())
# older pandas versions
#mask = (df['col_A'] == 1) & (df['col_B'].notnull())
df.loc[mask, 'col_new'] = 'val'
print (df)
   col_A col_B col_new
0      1     a     val
1      1  None     NaN
2      2  None     NaN
3      1     a     val
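
If it is unclear which kind of None the column holds, or it holds a mix of both, the two checks can be combined. This is a minimal sketch under that assumption; the helper name has_value and the sample data are illustrative and not part of the original answer.

```python
import pandas as pd

# Illustrative frame mixing the string 'None' with a real missing value.
df = pd.DataFrame({
    'col_A': [1, 1, 1, 2],
    'col_B': ['a', 'None', None, 'b'],
})

# Treat both real missing values and the literal string 'None' as empty.
has_value = df['col_B'].notna() & (df['col_B'] != 'None')

mask = (df['col_A'] == 1) & has_value
df.loc[mask, 'col_new'] = 'val'
print(df)
```

Only the first row satisfies both conditions here, so it is the only row that receives 'val'.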

Also, if you want if-else behaviour, numpy.where is really helpful:

import numpy as np

df['col_new'] = np.where(mask, 'val', 'another_val')
print (df)
   col_A col_B      col_new
0      1     a          val
1      1  None  another_val
2      2  None  another_val
3      1     a          val
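
To try the same approach at the size mentioned in the question, the sketch below builds a frame with 120,000 rows of random data and applies the mask with numpy.where. The generated values and the seed are assumptions for illustration only; the real data will differ.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 120_000  # row count taken from the question

# Random stand-in data: col_A is 1 or 2, col_B is 'a' or the string 'None'.
df = pd.DataFrame({
    'col_A': rng.integers(1, 3, size=n),
    'col_B': rng.choice(['a', 'None'], size=n),
})

# One vectorized pass over the columns, no Python-level loop.
mask = (df['col_A'] == 1) & (df['col_B'] != 'None')
df['col_new'] = np.where(mask, 'val', 'another_val')

print(df['col_new'].value_counts())
```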

Answer 1 (accepted, 2018-09-14 05:05:35)
