How to create a column of dummy variables in Pandas?

Question

I have a column of time-series data that looks like this:

TimeStamp               Data
2002-01-01 00:00:00     0.00120 
2002-01-01 08:00:00     0.00070 
2002-01-01 12:00:00     0.00000 
2002-01-01 16:00:00    -0.00440 
...
2003-01-01 12:00:00     0.00220 
2003-01-01 16:00:00    -0.00440

In general, the column contains positive values, negative values, and 0.00000. I would like to add a dummy column in which positive numbers are represented by 1, negative numbers by 0, and 0.00000 by 2. I can do this with a loop, but that doesn't seem like a smart approach when I am using Pandas.

Could anyone tell me the proper way of doing this in Pandas? Thank you!

Answer 1

You could do something like this:

# initialise a column named sign (a scalar is broadcast to every row)
df["sign"] = 0

# apply to all cases
df.loc[df["Data"] < 0, "sign"] = 0
df.loc[df["Data"] > 0, "sign"] = 1
df.loc[df["Data"] == 0, "sign"] = 2

Answer 2

There's np.sign, which gives 1, 0, -1 for positive, zero, and negative values respectively, if that coding works for you:

import numpy as np
df['sign'] = np.sign(df['Data'])
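
Since np.sign yields 1, 0, -1 rather than the 1, 0, 2 coding asked for in the question, one possible follow-up is to remap its output with Series.map. This is only a sketch: the sample values are copied from the question, and the column name "dummy" is an assumption chosen for illustration.

```python
import numpy as np
import pandas as pd

# Sample values taken from the question's Data column
df = pd.DataFrame({"Data": [0.00120, 0.00070, 0.00000, -0.00440]})

# np.sign returns 1.0, 0.0 or -1.0 for a float column
sign = np.sign(df["Data"])

# Remap to the coding from the question:
# positive -> 1, negative -> 0, zero -> 2
df["dummy"] = sign.map({1.0: 1, -1.0: 0, 0.0: 2})
```

Any NaN in Data would stay NaN after the map, so missing values may need separate handling.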

Answer 3

You can use np.select:

df['dummy'] = np.select((df.Data<0, df.Data>0), (0,1), 2)
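
For anyone who wants to try this end to end, here is a self-contained sketch of the same np.select call, spelled out with named lists. The sample values come from the question; the names conditions and choices are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample values taken from the question's Data column
df = pd.DataFrame({"Data": [0.00120, 0.00070, 0.00000, -0.00440]})

# Each condition is paired with the choice at the same position
conditions = [df["Data"] < 0, df["Data"] > 0]
choices = [0, 1]

# Rows matching no condition (here, exact zeros) get the default value 2
df["dummy"] = np.select(conditions, choices, default=2)
```

Note that a NaN in Data matches neither condition, so it would also receive the default of 2 with this approach.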

Answer 4

I believe this should work.

df.loc[df['Data']>0,'Dummy Column'] = 1
df.loc[df['Data']<0,'Dummy Column'] = 0
df.loc[df['Data']==0,'Dummy Column'] = 2
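
One thing to be aware of with this approach: because the column is created by the first masked assignment, the rows not yet assigned hold NaN for a moment, and as far as I know the column usually ends up with a float dtype as a result. A minimal sketch, assuming the sample values from the question and no missing values in Data, that casts the result back to integers:

```python
import pandas as pd

# Sample values taken from the question's Data column
df = pd.DataFrame({"Data": [0.00120, 0.00070, 0.00000, -0.00440]})

# The first assignment creates the column; unmatched rows are NaN for now
df.loc[df["Data"] > 0, "Dummy Column"] = 1
df.loc[df["Data"] < 0, "Dummy Column"] = 0
df.loc[df["Data"] == 0, "Dummy Column"] = 2

# Every row is filled at this point, so the cast to int is safe here;
# it would raise if Data contained NaN and left gaps in the column
df["Dummy Column"] = df["Dummy Column"].astype(int)
```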

4 answers

solution1
1 ACCEPTED 2020-06-25 16:02:23

solution2
1 2020-06-25 16:04:16

solution3
1 2020-06-25 16:04:41

solution4
1 2020-06-25 16:06:50
