
keras.Normalization column wise

I want to add a Normalization layer to my Keras model. I am testing it on a simpler example, but I don't understand the results.

I did a simple test:

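from tensorflow.keras.layers import Normalization

normalizer = Normalization(axis=-1)
normalizer.adapt(x_train[:3])  # learn per-column mean and variance from the first three rows
print(x_train[:3])
print(normalizer(x_train[:3]))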

And I got the following results:

[[ 82.83  31.04  47.   151.    17.88   0.    58.  ]
 [ 59.71  19.01  50.   141.     6.08   0.    60.  ]
 [133.33  62.68  84.   279.    15.17   0.    65.  ]]
tf.Tensor(
[[-0.2968958  -0.3549137  -0.79461485 -0.62603205  0.95840394  0.
  -1.0190493 ]
 [-1.0490034  -1.0080925  -0.6158265  -0.7851927  -1.3798107   0.
  -0.3396831 ]
 [ 1.3458993   1.3630061   1.4104416   1.411225    0.42140734  0.
   1.3587323 ]], shape=(3, 7), dtype=float32)

My question is: if the element in the third row, first column is the maximum of its column, shouldn't it be 1 in the normalized output?

UPDATE

It is clear now: I was confusing this with min-max scaling.

Now I have the issue that if I call adapt on the whole training dataset:

normalizer = Normalization(axis=-1)
normalizer.adapt(x_train)
print(x_train[:3])
print(normalizer(x_train[:3]))

Then the second column always gives me nan values:

[[ 82.83  31.04  47.   151.    17.88   0.    58.  ]
 [ 59.71  19.01  50.   141.     6.08   0.    60.  ]
 [133.33  62.68  84.   279.    15.17   0.    65.  ]]
tf.Tensor(
[[-0.51946616         nan -1.4330941  -0.5569647   0.8550693  -0.05900022
  -0.17098609]
 [-1.3537331          nan -1.2127512  -0.62386954 -0.8509362  -0.05900022
  -0.1282853 ]
 [ 1.3027862          nan  1.2844696   0.29941723  0.4632664  -0.05900022
  -0.02153332]], shape=(3, 7), dtype=float32)

Why does that column have nan values?
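One possible explanation, and it is only an assumption because the full x_train is not shown, is that the second column contains at least one NaN somewhere outside the first three rows. A single NaN makes the column mean NaN, and every value normalized with that mean becomes NaN too. The sketch below uses plain Python instead of the Keras layer, and the extra NaN entry is hypothetical:

```python
import math

def column_mean(column):
    """Plain arithmetic mean; any NaN in the input makes the result NaN."""
    return sum(column) / len(column)

# Hypothetical second column: the three values from the question plus one
# missing entry encoded as NaN (an assumption, the full data is not shown).
second_column = [31.04, 19.01, 62.68, float("nan")]

mean = column_mean(second_column)
# Subtracting a NaN mean turns every entry into NaN, including the valid ones.
centered = [x - mean for x in second_column]
print(mean, centered)

# A check worth running on the real data: does the column contain any NaN?
has_nan = any(math.isnan(x) for x in second_column)
print(has_nan)
```

If x_train is a NumPy array, np.isnan(x_train).any(axis=0) gives the same check for every column at once.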

You might be confusing this layer with min-max scaling. The docs clearly state that:

This layer will shift and scale inputs into a distribution centered around 0 with standard deviation 1. It accomplishes this by precomputing the mean and variance of the data, and calling (input - mean) / sqrt(var) at runtime.
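To make the formula concrete, here is a minimal sketch that applies (input - mean) / sqrt(var) by hand to the first column of the three rows in the question. It uses plain Python rather than the Keras layer so the arithmetic is visible; the helper name standardize is only for illustration, and it assumes the population variance (dividing by n):

```python
import math

def standardize(column):
    """Apply (x - mean) / sqrt(var) using the population variance."""
    n = len(column)
    mean = sum(column) / n
    var = sum((x - mean) ** 2 for x in column) / n
    return [(x - mean) / math.sqrt(var) for x in column]

# First column of the three rows shown in the question.
first_column = [82.83, 59.71, 133.33]
z_scores = standardize(first_column)
print([round(z, 4) for z in z_scores])
```

The values agree with the first column of the layer's output in the question to about four decimal places. The maximum, 133.33, lands at roughly 1.35 standard deviations above the mean, not at 1.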

import tensorflow as tf

normalizer = tf.keras.layers.Normalization(axis=-1)
x_train = tf.constant([[82.83, 31.04, 47., 151., 17.88, 0., 58.],
                        [59.71, 19.01, 50., 141., 6.08, 0., 60.],
                        [133.33, 62.68, 84., 279., 15.17, 0., 65.]])
normalizer.adapt(x_train)
norm_x = normalizer(x_train)
print(tf.reduce_mean(norm_x), tf.math.reduce_std(norm_x))
tf.Tensor(6.81196e-08, shape=(), dtype=float32) tf.Tensor(0.9258201, shape=(), dtype=float32)
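The overall standard deviation printed here is 0.9258 rather than 1 mainly because the sixth column is constant (all zeros). Its normalized values stay at 0, while each of the other six columns has a standard deviation of 1, so the pooled value is sqrt(6/7) ≈ 0.9258. A rough check in plain Python, not the Keras layer itself; mapping a zero-variance column to zeros is an assumption that mirrors the zeros seen in the output above:

```python
import math

rows = [
    [82.83, 31.04, 47.0, 151.0, 17.88, 0.0, 58.0],
    [59.71, 19.01, 50.0, 141.0, 6.08, 0.0, 60.0],
    [133.33, 62.68, 84.0, 279.0, 15.17, 0.0, 65.0],
]

def standardize_column(column):
    """Column-wise (x - mean) / sqrt(var); a zero-variance column maps to zeros."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
    if std == 0.0:
        return [0.0] * n
    return [(x - mean) / std for x in column]

# Normalize each column separately, then pool all 21 values.
columns = [standardize_column(list(col)) for col in zip(*rows)]
flat = [z for col in columns for z in col]
overall_mean = sum(flat) / len(flat)
overall_std = math.sqrt(sum((z - overall_mean) ** 2 for z in flat) / len(flat))
print(overall_std, math.sqrt(6 / 7))
```

Both printed numbers should match to floating-point precision.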

With more data, you should come close to mean 0 and std 1. Check this post for min-max scaling.
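For comparison, here is a minimal sketch of min-max scaling, which is the behaviour the question expected. It is plain Python, the function name min_max_scale is hypothetical, and returning zeros for a constant column is an assumption made here to avoid dividing by zero:

```python
def min_max_scale(column):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column has no range; map it to zeros (an assumption).
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

# First column of the three rows shown in the question.
first_column = [82.83, 59.71, 133.33]
scaled = min_max_scale(first_column)
print(scaled)
```

With this scaling the column maximum, 133.33, maps to exactly 1 and the minimum, 59.71, maps to 0.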
