Multivariate Normal Distribution fitting dataset
I was reading a few papers about RNNs. At some point, I came across the following explanations:
The prediction model trained on sN is used to compute the error vectors for each point in the validation and test sequences. The error vectors are modelled to fit a multivariate Gaussian distribution N = N (μ, Σ). The likelihood p(t) of observing an error vector e(t) is given by the value of N at e(t) (similar to normalized innovations squared (NIS) used for novelty detection using Kalman filter based dynamic prediction model [5]). The error vectors for the points from vN1 are used to estimate the parameters μ and Σ using Maximum Likelihood Estimation.
And:
A Multivariate Gaussian Distribution is fitted to the error vectors on the validation set. y (t) is the probability of an error vector e (t) after applying Multivariate Gaussian Distribution N = N (µ, Σ). Maximum Likelihood Estimation is used to select the parameters µ and Σ for the points from vN.
vN and vN1 are validation datasets. sN is the training dataset.
They are from two different articles but describe the same thing. I didn't really understand what they mean by fitting a Multivariate Gaussian Distribution to the data. What does it mean?
Many thanks,
Guillaume
Let's start with one-dimensional data. If your data is distributed along a 1D line, it has a mean (µ) and a standard deviation (sigma, the square root of the variance). Modeling it is then as simple as having (µ, sigma), which is all you need to generate a new data point following the same distribution.
# Generating a new_point in a 1D Gaussian distribution
import random
mu, sigma = 1, 1.6
new_point = random.gauss(mu, sigma)
# e.g. 2.797757476598497 (the value differs on every run)
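Going the other way, "fitting" a 1D Gaussian means estimating µ and sigma from data you already have. Here is a minimal sketch of that step. The `samples` list is synthetic data made up for illustration, and the names `mu_hat` and `sigma_hat` are my own. For a Gaussian, the maximum likelihood estimates are the sample mean and the standard deviation computed with a divisor of n.

```python
import random
import statistics

# Illustrative data only: draw 10,000 points from a known Gaussian.
random.seed(0)
samples = [random.gauss(1, 1.6) for _ in range(10_000)]

# Maximum likelihood estimates for a 1D Gaussian:
# the sample mean and the standard deviation that divides by n (not n - 1).
mu_hat = statistics.fmean(samples)
sigma_hat = statistics.pstdev(samples, mu=mu_hat)

print(mu_hat, sigma_hat)  # should be close to 1 and 1.6
```

With enough data, the estimates land close to the parameters that generated the samples.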
Now, in N-dimensional space, the multivariate normal distribution is a generalization of the one-dimensional case. The objective in general is to find the N means (the vector µ) and the N x N covariance matrix, this time denoted Σ, that model all the data points in the N-dimensional space. Fitting the distribution to a dataset means estimating these parameters from the data. Having them, you can generate as many random data points as you want following the same distribution. In Python/NumPy, you can do it like:
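import numpy as np
mean = [0.0, 0.0]  # example values only: one mean per dimension
covariance = [[1.0, 0.5], [0.5, 2.0]]  # example values only: N x N, symmetric
new_data_point = np.random.multivariate_normal(mean, covariance, 1)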
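To connect this back to the quoted papers, here is a rough sketch of what "fitting a multivariate Gaussian to the error vectors" could look like. It is not the papers' actual code. The `errors` array is synthetic stand-in data for the validation error vectors e(t), and `gaussian_density` is a hypothetical helper written for this example. The maximum likelihood estimates of µ and Σ are the sample mean and the sample covariance (dividing by n), and p(t) is the value of the fitted density at e(t).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-in for the validation error vectors e(t):
# 5,000 three-dimensional vectors drawn from a known Gaussian.
true_mean = np.array([0.0, 1.0, -1.0])
true_cov = np.array([[1.0, 0.3, 0.0],
                     [0.3, 2.0, 0.5],
                     [0.0, 0.5, 1.5]])
errors = rng.multivariate_normal(true_mean, true_cov, size=5000)

# Maximum likelihood estimates: sample mean and covariance (dividing by n).
mu_hat = errors.mean(axis=0)
sigma_hat = np.cov(errors, rowvar=False, bias=True)

def gaussian_density(e, mu, cov):
    """Value of the density N(mu, cov) at the vector e."""
    d = e - mu
    n = mu.shape[0]
    # Solve cov @ x = d instead of inverting cov explicitly.
    mahalanobis_sq = d @ np.linalg.solve(cov, d)
    _, logdet = np.linalg.slogdet(cov)
    log_p = -0.5 * (n * np.log(2 * np.pi) + logdet + mahalanobis_sq)
    return np.exp(log_p)

# A typical error vector scores much higher than an unusual one.
p_typical = gaussian_density(mu_hat, mu_hat, sigma_hat)
p_unusual = gaussian_density(mu_hat + 10.0, mu_hat, sigma_hat)
print(p_typical, p_unusual)
```

An error vector with a very low density value is unlikely under the fitted model, which is how the papers use p(t) to flag unusual points. In practice the log of the density is often used instead, since the raw values underflow quickly in high dimensions.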