Python: generate a dummy variable in a dataframe based on another variable

Question

I have a dataframe with many variables. I would like to generate a dummy variable based on, for example, column 1. If column 1's observation is NaN, the dummy variable should be 0; if column 1's observation is not missing, the dummy variable should be 1. Any ideas? Thanks a lot.

Answer 1

This is the easiest way:

# sample data
import pandas as pd 
import numpy as np
df = pd.DataFrame()
df['sample'] = [1,2,np.nan,4,5,np.nan]

# create dummy column
df['dummy'] = np.where(df['sample'].isna(),0,1)
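
As a possible alternative to the np.where call above, the same 0/1 column can be built with pandas alone. This is only a sketch on the same sample data, not part of the original answer: notna() marks non-missing values as True, and casting that boolean Series to int yields 1 for present values and 0 for NaN.

```python
import numpy as np
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame({"sample": [1, 2, np.nan, 4, 5, np.nan]})

# notna() is True where a value is present and False where it is NaN;
# astype(int) converts True/False to 1/0
df["dummy"] = df["sample"].notna().astype(int)

print(df)
```

Both approaches should give the same column here; the np.where form is the more flexible one if the two fill values ever need to be something other than 1 and 0.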

Solution 1
1 Accepted 2021-03-15 16:17:59