
Python: Generate a dummy variable in a dataframe based on another variable

I have a dataframe with many variables. I would like to generate a dummy variable based on, for example, column 1. If column 1's observation is NaN, the dummy variable should be 0. If column 1's observation is not missing, the dummy variable should be 1. Any ideas? Thanks a lot.

One simple way is to use np.where:

# sample data
import numpy as np
import pandas as pd

df = pd.DataFrame({'sample': [1, 2, np.nan, 4, 5, np.nan]})

# create dummy column: 0 where 'sample' is NaN, 1 otherwise
df['dummy'] = np.where(df['sample'].isna(), 0, 1)
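
An equivalent result can probably be had without np.where, by converting the boolean "not missing" mask straight to integers. The sketch below assumes the same illustrative column name `sample` as above; it is an alternative, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same sample data as above ('sample' is just an illustrative column name)
df = pd.DataFrame({"sample": [1, 2, np.nan, 4, 5, np.nan]})

# notna() gives True where a value is present and False where it is NaN;
# astype(int) maps True -> 1 and False -> 0
df["dummy"] = df["sample"].notna().astype(int)

print(df)
```

Both versions should give the same 0/1 column. The np.where form is more convenient if you later want values other than 0 and 1.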


 