

Geometric representation of Perceptrons (Artificial neural networks)

I am taking this course on Neural Networks on Coursera by Geoffrey Hinton (not the current run).

I have a very basic doubt about weight spaces. https://d396qusza40orc.cloudfront.net/neuralnets/lecture_slides%2Flec2.pdf, page 18.

If I have a weight vector (bias is 0) of [w1=1, w2=2] and training cases {1,2,-1} and {2,1,1}, where I guess {1,2} and {2,1} are the input vectors, how can this be represented geometrically?

I am unable to visualize it. Why does a training case give a plane which divides the weight space into 2? Could somebody explain this in a 3-dimensional coordinate system?

The following is the text from the ppt:

1. Weight-space has one dimension per weight.

2. A point in the space represents a particular setting for all the weights.

3. Assuming that we have eliminated the threshold, each training case can be represented as a hyperplane through the origin.

My doubt is in the third point above. Kindly help me understand.

It's probably easier to explain if you look deeper into the math. Basically, what a single layer of a neural net does is perform some function on your input vector, transforming it into a different vector space.

You don't want to jump right into thinking of this in 3 dimensions. Start smaller: it's easy to make diagrams in 1-2 dimensions, and nearly impossible to draw anything worthwhile in 3 dimensions (unless you're a brilliant artist), and being able to sketch this stuff out is invaluable.

Let's take the simplest case, where you're taking in an input vector of length 2 and you have a weight vector of dimension 2x1, which implies an output vector of length one (effectively a scalar).

In this case it's pretty easy to imagine that you've got something of the form:

input = [x, y]
weight = [a, b]
output = ax + by

If we assume that weight = [1, 3], we can see, and hopefully intuit, that the response of our perceptron will be something like this: (image: the response as a plane over the input space)

With the behavior being largely unchanged for different values of the weight vector.

It's easy to imagine, then, that if you're constraining your output to a binary space, there is a plane, maybe 0.5 units above the one shown above, that constitutes your "decision boundary".
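To make that concrete, here is a minimal sketch (Python with numpy; the 0.5 threshold is just the assumed value from the remark above) of evaluating the response plane and the resulting binary decision:

    import numpy as np

    weight = np.array([1.0, 3.0])   # the weight vector assumed above
    threshold = 0.5                 # assumed decision threshold ("0.5 units above")

    def response(x):
        # Raw output: a point on the plane z = 1*x + 3*y over the input space.
        return np.dot(weight, x)

    def classify(x):
        # Binary decision: which side of the plane z = threshold the input falls on.
        return 1 if response(x) > threshold else 0

    for point in [np.array([0.0, 0.0]), np.array([0.5, 0.0]), np.array([1.0, 1.0])]:
        print(point, response(point), classify(point))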

As you move into higher dimensions this becomes harder and harder to visualize, but if you imagine that the plane shown isn't merely a 2-d plane, but an n-d plane or a hyperplane, you can imagine that this same process happens.

Since actually creating the hyperplane requires either the input or the output to be fixed, you can think of giving your perceptron a single training value as creating a "fixed" [x,y] value. This can be used to create a hyperplane. Sadly, this cannot be visualized effectively, as 4-d drawings are not really feasible in a browser.

Hope that clears things up, let me know if you have more questions.

I encountered this question on SO while preparing a large article on linear combinations (it is in Russian, https://habrahabr.ru/post/324736/). It has a section on the weight space and I would like to share some thoughts from it.

Let's take a simple case of a linearly separable dataset with two classes, red and green:

(image: the dataset in dataspace X, with samples as points and the weight coefficients defining a separating line)

The illustration above is in the dataspace X, where samples are represented by points and the weight coefficients constitute a line. It can be conveyed by the following formula:

w^T * x + b = 0

But we can rewrite it the other way around, making the x component a vector coefficient and w a vector variable:

x^T * w + b = 0

because the dot product is symmetric. Now it can be visualized in the weight space the following way:

(image: the same data in weight space, with samples as lines and the weight vector as a point)

where the red and green lines are the samples and the blue point is the weight.

The possible weights are limited to the area below (shown in magenta):

(image: the magenta region of feasible weights in weight space)

which can be visualized in dataspace X as:

(image: the corresponding set of separating lines in dataspace X)
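To make the duality concrete, here is a minimal numeric sketch; the sample points, label convention, and candidate weight below are assumed for illustration, not taken from the figures:

    import numpy as np

    samples = [(np.array([1.0, 3.0]), +1),   # a "green" sample, label +1
               (np.array([3.0, 1.0]), -1)]   # a "red" sample, label -1
    w, b = np.array([-1.0, 2.0]), 0.0        # a candidate weight vector and bias

    for x, label in samples:
        # w^T x + b and x^T w + b are the same number (the dot product is symmetric),
        # so a fixed sample x defines a line in weight space, just as the fixed
        # weight w defines a line in dataspace.
        value = np.dot(w, x) + b
        assert value == np.dot(x, w) + b
        print(x, label, "correctly classified:", np.sign(value) == label)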

Hope it clarifies the dataspace/weightspace correlation a bit. Feel free to ask questions; I will be glad to explain in more detail.

The "decision boundary" for a single layer perceptron is a plane (hyperplane).

(image: a plane with its normal vector n)

where n in the image is the weight vector w; in your case w = {w1=1, w2=2} = (1,2), and the direction specifies which side is the right side. n is orthogonal (90 degrees) to the plane.

A plane always splits a space into 2, naturally (extend the plane to infinity in each direction).

You can also try to input different values into the perceptron and try to find where the response is zero (it is zero only on the decision boundary).
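For example, a tiny sketch (assuming the weights from the question, w = (1, 2), and zero bias) of probing the response for different inputs:

    w = (1.0, 2.0)  # weights from the question, bias 0

    def response(x1, x2):
        return w[0] * x1 + w[1] * x2

    print(response(2.0, -1.0))   #  0.0 -> on the decision boundary (1*x1 + 2*x2 = 0)
    print(response(1.0, 2.0))    #  5.0 -> positive side
    print(response(-2.0, -1.0))  # -4.0 -> negative side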

Recommend you read up on linear algebra to understand it better: https://www.khanacademy.org/math/linear-algebra/vectors_and_spaces

For a perceptron with 1 input layer & 1 output layer, there can only be 1 LINEAR hyperplane. And since there is no bias, the hyperplane cannot shift along an axis, so all such hyperplanes will share the same point, the origin. However, if there is a bias, they may no longer share a common point.

I think the reason why a training case can be represented as a hyperplane is because... Let's say [j,k] is the weight vector and [m,n] is the training input:

training-output = jm + kn

Given that the training case in this perspective is fixed and the weights vary, the training input (m, n) becomes the coefficient and the weights (j, k) become the variables. Just as in any textbook where z = ax + by is a plane, training-output = jm + kn is also a plane defined by training-output, m, and n.
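A short sketch of that viewpoint, with an assumed training input (m, n) = (1, 2): fixing the input and varying the weights, the output is a plane over (j, k), and the weights giving output 0 form a line through the origin in weight space:

    m, n = 1.0, 2.0  # assumed training input, held fixed

    def training_output(j, k):
        # With (m, n) fixed, this is a plane over the weight space (j, k).
        return j * m + k * n

    print(training_output(2.0, -1.0))   #  0.0 -> this weight lies on the hyperplane
    print(training_output(1.0, 1.0))    #  3.0 -> one side of it
    print(training_output(-1.0, -1.0))  # -3.0 -> the other side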

The equation of a plane passing through the origin is written in the form:

ax+by+cz=0

If a=1, b=2, c=3, the equation of the plane can be written as:

x+2y+3z=0

So, in the XYZ space, the equation is: x+2y+3z=0

Now, in the weight space, every dimension represents a weight. So, if the perceptron has 10 weights, the weight space will be 10-dimensional.

Equation of the perceptron: ax+by+cz<=0 ==> Class 0

                          ax+by+cz>0  ==> Class 1

In this case, a, b & c are the weights; x, y & z are the input features.

In the weight space, a, b & c are the variables (axes).

So, for every training example, e.g. (x,y,z) = (2,3,4), a hyperplane would be formed in the weight space whose equation would be:

2a+3b+4c=0

passing through the origin.
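A short sketch of that hyperplane in weight space, using the training example (x, y, z) = (2, 3, 4) from above and the class rule stated earlier:

    x, y, z = 2.0, 3.0, 4.0  # the training example; held fixed in weight space

    def perceptron_output(a, b, c):
        return a * x + b * y + c * z

    # Weights on the hyperplane 2a + 3b + 4c = 0 give exactly zero; weights on one
    # side put the example in Class 1, weights on the other side in Class 0.
    for weights in [(2.0, 0.0, -1.0),    # 2*2 + 0 - 4 = 0 -> on the hyperplane
                    (1.0, 1.0, 1.0),     #  9 -> Class 1 side
                    (-1.0, -1.0, 1.0)]:  # -1 -> Class 0 side
        value = perceptron_output(*weights)
        print(weights, value, "Class 1" if value > 0 else "Class 0")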

I hope, now, you understand it.

Example

Consider we have 2 weights, so w = [w1, w2]. Suppose we have input x = [x1, x2] = [1, 2]. If you use the weight to do a prediction, you have z = w1*x1 + w2*x2 and prediction y = z > 0 ? 1 : 0.

Suppose the label for the input x is 1. Thus, we hope y = 1, and so we want z = w1*x1 + w2*x2 > 0. Considering vector multiplication, z = (w^T)x, so we want (w^T)x > 0. The geometric interpretation of this expression is that the angle between w and x is less than 90 degrees. For example, the green vector is a candidate for w that would give the correct prediction of 1 in this case. Actually, any vector that lies on the same side of the line w1 + 2*w2 = 0 as the green vector would give the correct solution. However, if it lies on the other side, as the red vector does, then it would give the wrong answer. Now suppose the label is 0; then the case would just be the reverse.

The above case gives the intuitive understanding and just illustrates the 3 points in the lecture slide. The training case x determines the plane, and depending on the label, the weight vector must lie on one particular side of that plane to give the correct answer.
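A small sketch of the angle argument; the "green" and "red" candidate weight vectors below are assumed examples, not the ones from the answer's figure:

    import numpy as np

    x = np.array([1.0, 2.0])  # the training input; assume its label is 1

    def angle_deg(u, v):
        return np.degrees(np.arccos(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))))

    for name, w in [("green", np.array([2.0, 1.0])),    # positive side of w1 + 2*w2 = 0
                    ("red",   np.array([1.0, -1.0]))]:  # negative side
        z = np.dot(w, x)
        # Correct prediction (1) exactly when the angle between w and x is under 90 degrees.
        print(name, "angle:", round(angle_deg(w, x), 1), "z:", z, "prediction:", int(z > 0))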
