Create dummy DataFrame based on conditions
I am trying to create a dummy DataFrame df_dummy from a DataFrame df based on multiple conditions.
df:
ID1 ID2 ID3
Date
2022-01-01 -1.0 -0.1 0.0
2022-01-02 0.0 1.2 0.7
2022-01-03 NaN 2.0 1.0
2022-01-04 -0.8 0.0 0.0
2022-01-05 1.1 NaN -0.5
df_dummy:
ID1 ID2 ID3
Date
2022-01-01 0 0 NaN
2022-01-02 NaN 1 1
2022-01-03 NaN 1 1
2022-01-04 0 NaN NaN
2022-01-05 1 NaN 0
I tried to define a signal function for the dummy like this:
def signal(x):
    if x > 0:
        return 1
    elif x < 0:
        return 0
    else:
        return np.nan
df_dummy = df[:].apply(lambda x: signal, axis=1)
data_signal = df[:].apply(lambda x: 1 if x>0 -1 if x<0 else np.nan, axis=1)
Is there an intuitive way to create df_dummy with conditions like these?
Many thanks!
You can use np.select:
# np.select returns a numpy array
# so we copy df to preserve its index/columns
df_dummy = df.copy()
df_dummy[:] = np.select((df > 0, df < 0), (1, 0), np.nan)
Or:
df_dummy = pd.DataFrame(np.select((df > 0, df < 0), (1, 0), np.nan),
                        index=df.index, columns=df.columns)
Output:
ID1 ID2 ID3
Date
2022-01-01 0.0 0.0 NaN
2022-01-02 NaN 1.0 1.0
2022-01-03 NaN 1.0 1.0
2022-01-04 0.0 NaN NaN
2022-01-05 1.0 NaN 0.0
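For anyone who wants to run the np.select approach end to end, here is a minimal self-contained sketch. It rebuilds the question's df by hand; the date_range call and the names values, conditions and choices are my own illustration, not part of the answer. It also converts df to a NumPy array first so the conditions are plain boolean arrays.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (assumed daily dates)
df = pd.DataFrame(
    {
        "ID1": [-1.0, 0.0, np.nan, -0.8, 1.1],
        "ID2": [-0.1, 1.2, 2.0, 0.0, np.nan],
        "ID3": [0.0, 0.7, 1.0, 0.0, -0.5],
    },
    index=pd.date_range("2022-01-01", periods=5, name="Date"),
)

values = df.to_numpy()

# Conditions are checked in order; the first True one picks the choice.
# NaN and 0 satisfy neither condition, so they fall through to the default.
conditions = [values > 0, values < 0]
choices = [1, 0]

df_dummy = pd.DataFrame(
    np.select(conditions, choices, default=np.nan),
    index=df.index,
    columns=df.columns,
)
print(df_dummy)
```

Because one of the possible values is NaN, the result is float (0.0/1.0) rather than integer.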
Using pandas:
df_dummy = df.gt(0).astype(int).mask(df.isna()|df.eq(0))
Using numpy.sign:
df_dummy = np.sign(df).replace(0,np.nan).clip(0)
Output:
ID1 ID2 ID3
Date
2022-01-01 0.0 0.0 NaN
2022-01-02 NaN 1.0 1.0
2022-01-03 NaN 1.0 1.0
2022-01-04 0.0 NaN NaN
2022-01-05 1.0 NaN 0.0
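To check that the two one-liners really agree, the sketch below runs both on a hand-built copy of the question's df and compares the results. The names via_mask and via_sign and the date_range index are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hand-built copy of the question's data (assumed daily dates)
df = pd.DataFrame(
    {
        "ID1": [-1.0, 0.0, np.nan, -0.8, 1.1],
        "ID2": [-0.1, 1.2, 2.0, 0.0, np.nan],
        "ID3": [0.0, 0.7, 1.0, 0.0, -0.5],
    },
    index=pd.date_range("2022-01-01", periods=5, name="Date"),
)

# pandas only: 1 where positive, 0 elsewhere, then blank out NaN and zero cells
via_mask = df.gt(0).astype(int).mask(df.isna() | df.eq(0))

# numpy.sign: values become -1/0/1, zeros are replaced by NaN,
# and clipping at a lower bound of 0 turns -1 into 0
via_sign = np.sign(df).replace(0, np.nan).clip(lower=0)

# DataFrame.equals treats NaNs in the same position as equal
print(via_mask.equals(via_sign))
```

Note that the mask version has to exclude NaN and zero explicitly, while the sign version gets NaN for free (sign of NaN is NaN) and only needs to handle zero.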
You can use applymap with the signal function you defined (applymap is deprecated since pandas 2.1 in favor of DataFrame.map):
df_dummy = df.applymap(signal)
print(df_dummy)
Output:
ID1 ID2 ID3
Date
2022-01-01 0.0 0.0 NaN
2022-01-02 NaN 1.0 1.0
2022-01-03 NaN 1.0 1.0
2022-01-04 0.0 NaN NaN
2022-01-05 1.0 NaN 0.0
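If the same code has to run on both older and newer pandas, one possible sketch is to pick DataFrame.map when it exists and fall back to applymap otherwise. The hasattr check and the hand-built df are my own assumptions, not part of the answer above.

```python
import numpy as np
import pandas as pd


def signal(x):
    """Return 1 for positive, 0 for negative, NaN for zero or NaN."""
    if x > 0:
        return 1
    elif x < 0:
        return 0
    return np.nan  # comparisons with NaN are False, so NaN lands here too


# Hand-built copy of the question's data (assumed daily dates)
df = pd.DataFrame(
    {
        "ID1": [-1.0, 0.0, np.nan, -0.8, 1.1],
        "ID2": [-0.1, 1.2, 2.0, 0.0, np.nan],
        "ID3": [0.0, 0.7, 1.0, 0.0, -0.5],
    },
    index=pd.date_range("2022-01-01", periods=5, name="Date"),
)

# DataFrame.map is the element-wise method in pandas >= 2.1;
# older versions only provide applymap
if hasattr(pd.DataFrame, "map"):
    df_dummy = df.map(signal)
else:
    df_dummy = df.applymap(signal)

print(df_dummy)
```

Either method calls signal once per cell in Python, so on large frames the vectorized np.select or numpy.sign versions are usually the better fit.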