Create dummy DataFrame based on conditions

I am trying to create a dummy DataFrame df_dummy from a DataFrame df according to several conditions:

  • if value > 0 --> 1
  • if value < 0 --> 0
  • else (0, NaN) --> NaN

df:
            ID1     ID2     ID3
Date            
2022-01-01  -1.0    -0.1    0.0
2022-01-02  0.0     1.2     0.7
2022-01-03  NaN     2.0     1.0
2022-01-04  -0.8    0.0     0.0
2022-01-05  1.1     NaN     -0.5

df_dummy:
            ID1     ID2     ID3
Date            
2022-01-01  0       0       NaN
2022-01-02  NaN     1       1
2022-01-03  NaN     1       1
2022-01-04  0       NaN     NaN
2022-01-05  1       NaN     0

I tried to define a signal function for the dummy like this:

def signal(x):
    if(x>0): 
        return 1
    elif(x<0):
        return 0
    else:
        return np.nan
df_dummy = df[:].apply(lambda x: signal, axis=1)

data_signal = df[:].apply(lambda x: 1 if x>0 -1 if x<0 else np.nan, axis=1)

Is there an intuitive way to express such conditions when creating df_dummy?

Thanks a lot!

You can use np.select:

import numpy as np

# np.select returns a plain NumPy array, so copy df first
# to preserve its index and columns
df_dummy = df.copy()
df_dummy[:] = np.select((df > 0, df < 0), (1, 0), np.nan)

Alternatively, build the DataFrame directly:

df_dummy = pd.DataFrame(np.select((df > 0, df < 0), (1, 0), np.nan),
                        index=df.index, columns=df.columns)

Output:

            ID1  ID2  ID3
Date                     
2022-01-01  0.0  0.0  NaN
2022-01-02  NaN  1.0  1.0
2022-01-03  NaN  1.0  1.0
2022-01-04  0.0  NaN  NaN
2022-01-05  1.0  NaN  0.0
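
For readers who want to run this end to end, here is a minimal self-contained sketch of the np.select approach. The sample frame is rebuilt from the question's data; the names conditions and choices are only illustrative, and the conditions are converted with to_numpy() as an extra precaution rather than because the answer requires it.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {
        "ID1": [-1.0, 0.0, np.nan, -0.8, 1.1],
        "ID2": [-0.1, 1.2, 2.0, 0.0, np.nan],
        "ID3": [0.0, 0.7, 1.0, 0.0, -0.5],
    },
    index=pd.date_range("2022-01-01", periods=5, name="Date"),
)

# Conditions are evaluated in order; the first match wins.
# Cells that match neither condition (zeros and NaN) get the default.
conditions = [(df > 0).to_numpy(), (df < 0).to_numpy()]
choices = [1, 0]

df_dummy = pd.DataFrame(
    np.select(conditions, choices, default=np.nan),
    index=df.index,
    columns=df.columns,
)
print(df_dummy)
```

Because the default is np.nan, the result is a float array, which is why the output shows 0.0 and 1.0 rather than integers.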

Using pandas:

df_dummy = df.gt(0).astype(int).mask(df.isna()|df.eq(0))

With numpy.sign:

df_dummy = np.sign(df).replace(0,np.nan).clip(0)

Output:

            ID1  ID2  ID3
Date                     
2022-01-01  0.0  0.0  NaN
2022-01-02  NaN  1.0  1.0
2022-01-03  NaN  1.0  1.0
2022-01-04  0.0  NaN  NaN
2022-01-05  1.0  NaN  0.0
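
To see why both one-liners give the same result, it can help to split each chain into named steps. This is only a sketch on the question's sample data; the intermediate names (positive, undefined, signs, via_mask, via_sign) are illustrative and not part of the answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {
        "ID1": [-1.0, 0.0, np.nan, -0.8, 1.1],
        "ID2": [-0.1, 1.2, 2.0, 0.0, np.nan],
        "ID3": [0.0, 0.7, 1.0, 0.0, -0.5],
    },
    index=pd.date_range("2022-01-01", periods=5, name="Date"),
)

# pandas chain: 1 where positive, 0 elsewhere, then blank out zeros and NaN
positive = df.gt(0).astype(int)
undefined = df.isna() | df.eq(0)
via_mask = positive.mask(undefined)

# numpy.sign chain: sign is -1, 0 or 1 (NaN stays NaN);
# zeros become NaN, then -1 is clipped up to 0
signs = np.sign(df)
via_sign = signs.replace(0, np.nan).clip(lower=0)

# Compare the two results, treating NaN in the same position as equal
same = np.array_equal(via_mask.to_numpy(), via_sign.to_numpy(), equal_nan=True)
print(same)
```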

You can use applymap with the signal function you created:

df_dummy = df.applymap(signal)
print(df_dummy)

Output:

            ID1  ID2  ID3
Date                     
2022-01-01  0.0  0.0  NaN
2022-01-02  NaN  1.0  1.0
2022-01-03  NaN  1.0  1.0
2022-01-04  0.0  NaN  NaN
2022-01-05  1.0  NaN  0.0
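
One caveat: DataFrame.applymap was deprecated in pandas 2.1 in favor of DataFrame.map, which does the same element-wise call. The sketch below picks whichever method is available; the elementwise name and the hasattr check are my own additions, not part of the answer above.

```python
import numpy as np
import pandas as pd


def signal(x):
    # Scalar rule from the question: positive -> 1, negative -> 0,
    # zero or NaN -> NaN (NaN fails both comparisons)
    if x > 0:
        return 1
    elif x < 0:
        return 0
    else:
        return np.nan


# Sample data copied from the question
df = pd.DataFrame(
    {
        "ID1": [-1.0, 0.0, np.nan, -0.8, 1.1],
        "ID2": [-0.1, 1.2, 2.0, 0.0, np.nan],
        "ID3": [0.0, 0.7, 1.0, 0.0, -0.5],
    },
    index=pd.date_range("2022-01-01", periods=5, name="Date"),
)

# Prefer DataFrame.map (pandas >= 2.1); fall back to applymap on older versions
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap
df_dummy = elementwise(signal)
print(df_dummy)
```

Since signal is called once per cell in Python, this is likely to be slower on large frames than the vectorized approaches above, but it reuses the function from the question unchanged.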
