Pandas has a get_dummies method that one-hot encodes a categorical variable. Now I want to apply label smoothing as described in section 7.5.1 of the Deep Learning book:
Label smoothing regularizes a model based on a softmax with k output values by replacing the hard 0 and 1 classification targets with targets of ϵ / k and 1 - (k - 1) / k * ϵ, respectively.
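For concreteness, here is the book's formula evaluated for a hypothetical setting of k = 4 classes and ϵ = 0.1 (both values are just example choices, not from the book):

```python
# Book's convention: the eps mass is spread over all k classes,
# including the true one.
k, eps = 4, 0.1

off_target = eps / k                # target replacing the hard 0
on_target = 1 - (k - 1) / k * eps   # target replacing the hard 1

# Each smoothed row still sums to 1:
row_sum = on_target + (k - 1) * off_target
print(off_target, on_target, row_sum)
```

With these numbers the off-target value is 0.025, the on-target value is 0.925, and every row still sums to 1.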
What would be the most efficient and/or elegant way to do label smoothing on a Pandas DataFrame?
First, let's use a much simpler equation (ϵ denotes how much probability mass you move from the "true" label and distribute among all the remaining ones):
1 -> 1 - ϵ
0 -> ϵ / (k-1)
You can exploit a nice mathematical property of the above, since all you have to do is
x -> x * (1 - ϵ) + (1-x) * ϵ / (k-1)
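Plugging in the two hard targets (with hypothetical values ϵ = 0.1, k = 4) confirms that this single affine map reproduces both rules above:

```python
eps, k = 0.1, 4

def smooth(x):
    # One affine map covers both cases:
    #   x = 1  ->  1 - eps
    #   x = 0  ->  eps / (k - 1)
    return x * (1 - eps) + (1 - x) * eps / (k - 1)

print(smooth(1))  # 1 - eps = 0.9
print(smooth(0))  # eps / (k - 1) ≈ 0.0333
```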
Thus, if your dummy columns are a, b, c, d, just do
indices = ['a', 'b', 'c', 'd']
eps = 0.1
df[indices] = df[indices] * (1 - eps) + (1-df[indices]) * eps / (len(indices) - 1)
which for
>>> df
a b c d
0 1 0 0 0
1 0 1 0 0
2 0 0 0 1
3 1 0 0 0
4 0 1 0 0
5 0 0 1 0
returns
a b c d
0 0.900000 0.033333 0.033333 0.033333
1 0.033333 0.900000 0.033333 0.033333
2 0.033333 0.033333 0.033333 0.900000
3 0.900000 0.033333 0.033333 0.033333
4 0.033333 0.900000 0.033333 0.033333
5 0.033333 0.033333 0.900000 0.033333
as expected.
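Putting it together with get_dummies, a self-contained sketch (the label values and ϵ are just example choices) that reproduces the frame above:

```python
import pandas as pd

labels = pd.Series(['a', 'b', 'd', 'a', 'b', 'c'])

# One-hot encode; dtype=float keeps the arithmetic below
# in floating point rather than booleans/ints.
df = pd.get_dummies(labels, dtype=float)

eps = 0.1
k = df.shape[1]  # number of classes

# The same affine map, applied to the whole frame at once.
smoothed = df * (1 - eps) + (1 - df) * eps / (k - 1)

print(smoothed)
# Sanity check: every smoothed row still sums to 1.
print(smoothed.sum(axis=1))
```

Because the operation is a single vectorized expression over the DataFrame, it stays efficient regardless of the number of rows.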