Create a new column based on conditions in other columns

Question

I have a DataFrame whose columns contain some values, plus NaN wherever no value was assigned for that column.

import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [10, 46, 75, 12, 99, 84],
                   'col1': ['Nan', 15, 'Nan', 14, 'NaN', 'NaN'],
                   'col2': ['NaN', 'NaN', 'NaN', 12, 876, 4452],
                   'col3': ['NaN', 11, 13, 546, 9897, 1]})

df

This produces the following output:

   id col1  col2  col3
0  10  Nan   NaN   NaN
1  46   15   NaN    11
2  75  Nan   NaN    13
3  12   14    12   546
4  99  NaN   876  9897
5  84  NaN  4452     1

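My goal is to create a new column (col4) that shows 'original' for every row where all three columns (col1, col2, col3) are NaN, and 'referenced' otherwise. I tried np.where (shown below), but it doesn't work, (probably) because 'NaN' isn't being picked up as a numeric value.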

df['col4'] = np.where((df['col1'] == 'NaN') & (df['col2'] == 'NaN') & (df['col3'] == 'NaN'), 'original', 'referenced')

I'm not that advanced in Python and can't think of what the alternative should be.
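A quick check may help show why the attempt returns 'referenced' for every row. This is only a diagnostic sketch on the same data; the names missing_counts and row0_matches are illustrative and not part of the original post. It suggests two things: the placeholders are plain strings, so pandas does not count them as missing, and the comparison is case-sensitive, so 'Nan' in col1 never equals 'NaN'.

```python
import pandas as pd

df = pd.DataFrame({'id': [10, 46, 75, 12, 99, 84],
                   'col1': ['Nan', 15, 'Nan', 14, 'NaN', 'NaN'],
                   'col2': ['NaN', 'NaN', 'NaN', 12, 876, 4452],
                   'col3': ['NaN', 11, 13, 546, 9897, 1]})

# The placeholders are ordinary strings, so isna() finds nothing missing.
missing_counts = df[['col1', 'col2', 'col3']].isna().sum()
print(missing_counts)

# String comparison is case-sensitive: 'Nan' in row 0 is not equal to 'NaN',
# so the three-way "&" condition is False for the only row that should match.
row0_matches = (df.loc[0, ['col1', 'col2', 'col3']] == 'NaN').tolist()
print(row0_matches)
```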

Answer 1

Use DataFrame.isna to test every column for missing values, then DataFrame.all to test whether all values in each row are True:

# If necessary
import numpy as np

# Convert both string spellings to real missing values
df = df.replace(['Nan', 'NaN'], np.nan)

df['col4'] = np.where(df[['col1','col2','col3']].isna().all(axis=1), 'original', 'referenced')

Your solution, using Series.isna:

df['col4'] = np.where(df['col1'].isna() & df['col2'].isna() & df['col3'].isna(), 
                     'original', 'referenced')
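For reference, the steps above can be put together into one runnable sketch on the question's data. The names cols, clean and all_missing are illustrative choices, not something from the original answer, and the sketch writes to a copy so the original df is left untouched.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [10, 46, 75, 12, 99, 84],
                   'col1': ['Nan', 15, 'Nan', 14, 'NaN', 'NaN'],
                   'col2': ['NaN', 'NaN', 'NaN', 12, 876, 4452],
                   'col3': ['NaN', 11, 13, 546, 9897, 1]})

cols = ['col1', 'col2', 'col3']

# Turn both string spellings into real missing values (returns a new frame).
clean = df.replace(['Nan', 'NaN'], np.nan)

# True only for rows where every one of the three columns is missing.
all_missing = clean[cols].isna().all(axis=1)

clean['col4'] = np.where(all_missing, 'original', 'referenced')
print(clean[['id', 'col4']])
```

Only the first row (id 10) should come out as 'original', since it is the only row with all three columns missing.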

Answer 2

You should first replace the string NaN or Nan:

df = df.replace('(?i)nan', 'NaN', regex=True)
df['col4'] = np.where(df.filter(like='col').eq('NaN').all(axis=1), 'original', 'referenced')

# or

df = df.replace('(?i)nan', pd.NA, regex=True)
df['col4'] = np.where(df.filter(like='col').isna().all(axis=1), 'original', 'referenced')

print(df)

   id col1  col2  col3        col4
0  10  NaN   NaN   NaN    original
1  46   15   NaN    11  referenced
2  75  NaN   NaN    13  referenced
3  12   14    12   546  referenced
4  99  NaN   876  9897  referenced
5  84  NaN  4452     1  referenced
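One possible refinement, offered as a sketch rather than as part of the answer: the pattern '(?i)nan' is unanchored, so it may also match longer strings that merely contain those letters. Assuming the placeholders are always exactly "nan" in some letter case, anchoring the pattern keeps other text intact. The names pattern, cleaned, sample and sample_clean are hypothetical and exist only for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [10, 46, 75, 12, 99, 84],
                   'col1': ['Nan', 15, 'Nan', 14, 'NaN', 'NaN'],
                   'col2': ['NaN', 'NaN', 'NaN', 12, 876, 4452],
                   'col3': ['NaN', 11, 13, 546, 9897, 1]})

# Assumption: a placeholder is a cell that is exactly "nan", in any letter case.
pattern = r'(?i)^nan$'

cleaned = df.replace(pattern, pd.NA, regex=True)

# filter(like='col') selects col1, col2 and col3 (col4 does not exist yet).
cleaned['col4'] = np.where(cleaned.filter(like='col').isna().all(axis=1),
                           'original', 'referenced')
print(cleaned)

# Hypothetical extra data: a string that only contains "nan" is left alone.
sample = pd.Series(['Nan', 'banana', 7], dtype=object)
sample_clean = sample.replace(pattern, pd.NA, regex=True)
print(sample_clean.tolist())
```

On the question's data this should give the same col4 as the answer above; the anchoring only matters if other text columns are present.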

Answer 1
2 · accepted · 2022-05-26 11:07:53

Answer 2
0 · 2022-05-26 11:10:49
