How does `poly()` generate orthogonal polynomials? How to understand the "coefs" returned?
My understanding of orthogonal polynomials is that they take the form
y(x) = a1 + a2(x - c1) + a3(x - c2)(x - c3) + a4(x - c4)(x - c5)(x - c6) + ... up to the number of terms desired
where a1, a2, etc. are the coefficients of each orthogonal term (they vary between fits), and c1, c2, etc. are coefficients within the orthogonal terms, determined such that the terms remain orthogonal (they are the same across fits that use the same x values).
I understand that `poly()` is used to fit orthogonal polynomials. An example:
x = c(1.160, 1.143, 1.126, 1.109, 1.079, 1.053, 1.040, 1.027, 1.015, 1.004, 0.994, 0.985, 0.977) # abscissae not equally spaced
y = c(1.217395, 1.604360, 2.834947, 4.585687, 8.770932, 9.996260, 9.264800, 9.155079, 7.949278, 7.317690, 6.377519, 6.409620, 6.643426)
# construct the orthogonal polynomial
orth_poly <- poly(x, degree = 5)
# fit y to orthogonal polynomial
model <- lm(y ~ orth_poly)
I would like to extract both the coefficients a1, a2, etc., and the orthogonal coefficients c1, c2, etc., but I'm not sure how to do this. My guess is that `model$coefficients` returns the first set of coefficients, but I'm struggling with how to extract the others. Perhaps they are within `attributes(orth_poly)$coefs`?
Many thanks.
I have just realized that there was a closely related question, "Extracting orthogonal polynomial coefficients from R's poly() function?", asked 2 years ago. The answer there merely explains what `predict.poly` does, whereas my answer gives a complete picture.
Section 1: How does `poly` represent orthogonal polynomials?
"My understanding of orthogonal polynomials is that they take the form
y(x) = a1 + a2(x - c1) + a3(x - c2)(x - c3) + a4(x - c4)(x - c5)(x - c6) + ... up to the number of terms desired"
No, there is no such clean form. `poly()` generates monic orthogonal polynomials, which can be represented by a three-term recursion algorithm. This recursion is how `predict.poly` generates the linear predictor matrix. Surprisingly, `poly` itself does not use the recursion but brute force: a QR factorization of the model matrix of ordinary polynomials, which gives an orthogonal basis for the same span. The result is nevertheless equivalent to the recursion.
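For reference, the recursion can be written out as below. This is a sketch read off from the `my_poly` code in Section 3 rather than quoted from the R documentation; the symbols $P_i$, $\alpha_i$, $\beta_i$ and $l_i$ follow the names used in Section 2, and $\bar{x}$ is `mean(x)`.

```latex
\begin{aligned}
P_0(x) &= 1, \qquad P_1(x) = x - \bar{x}, \\
P_{i+1}(x) &= (x - \bar{x} - \alpha_i)\, P_i(x) - \beta_i\, P_{i-1}(x), \qquad i = 1, 2, \ldots, \\
l_i &= \sum_{k=1}^{n} P_i(x_k)^2, \qquad
\alpha_i = \frac{1}{l_i} \sum_{k=1}^{n} (x_k - \bar{x})\, P_i(x_k)^2, \qquad
\beta_i = \frac{l_i}{l_{i-1}}.
\end{aligned}
```

Here $l_0 = n$, and the $i$-th column of the returned matrix is $P_i(x) / \sqrt{l_i}$, so that the columns have unit length.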
Section 2: Explanation of the output of `poly()`
Let's consider an example. Take the `x` in your post:
X <- poly(x, degree = 5)
# 1 2 3 4 5
# [1,] 0.484259711 0.48436462 0.48074040 0.351250507 0.25411350
# [2,] 0.406027697 0.20038942 -0.06236564 -0.303377083 -0.46801416
# [3,] 0.327795682 -0.02660187 -0.34049024 -0.338222850 -0.11788140
# ... ... ... ... ... ...
#[12,] -0.321069852 0.28705108 -0.15397819 -0.006975615 0.16978124
#[13,] -0.357884918 0.42236400 -0.40180712 0.398738364 -0.34115435
#attr(,"coefs")
#attr(,"coefs")$alpha
#[1] 1.054769 1.078794 1.063917 1.075700 1.063079
#
#attr(,"coefs")$norm2
#[1] 1.000000e+00 1.300000e+01 4.722031e-02 1.028848e-04 2.550358e-07
#[6] 5.567156e-10 1.156628e-12
Here is what those attributes are:
`alpha[1]` gives `x_bar = mean(x)`, i.e., the centre;
`alpha - alpha[1]` gives `alpha0`, `alpha1`, ..., `alpha4` (`alpha5` is computed but dropped before `poly` returns `X`, as it won't be used in `predict.poly`);
the first value of `norm2` is always 1. The second to the last are `l0`, `l1`, ..., `l5`, giving the squared column norms of `X` before its columns are rescaled to unit length; `l0` is the squared column norm of the dropped `P0(x - x_bar)`, which is always `n` (i.e., `length(x)`), while the leading 1 is just padding so that the recursion can proceed inside `predict.poly`;
`beta0`, `beta1`, `beta2`, ..., `beta5` are not returned, but can be computed by `norm2[-1] / norm2[-length(norm2)]`.
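As a small illustration of the list above, the quantities can be pulled out of the `"coefs"` attribute as follows. This is only a sketch; the variable names `centre`, `alpha`, `l` and `beta` are chosen here for readability and are not part of the `poly` API.

```r
x <- c(1.160, 1.143, 1.126, 1.109, 1.079, 1.053, 1.040, 1.027,
       1.015, 1.004, 0.994, 0.985, 0.977)
coefs <- attr(poly(x, degree = 5), "coefs")

centre <- coefs$alpha[1]           # the centre, equal to mean(x)
alpha  <- coefs$alpha - centre     # alpha0, ..., alpha4 on the centred scale
l      <- coefs$norm2[-1]          # l0, ..., l5 (l0 is length(x))
## ratio of consecutive entries of norm2: beta0, ..., beta5
beta   <- coefs$norm2[-1] / coefs$norm2[-length(coefs$norm2)]
```

With the data above, `alpha[-1]` and `beta[2:5]` should correspond to the `alpha` and `beta` values that `my_poly` reports in Section 4.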
Section 3: Implementing `poly` using both QR factorization and the recursion algorithm
As mentioned earlier, `poly` does not use recursion, while `predict.poly` does. Personally I don't understand the logic / reason behind such an inconsistent design. Here I offer a function `my_poly`, written by myself, that uses recursion to generate the matrix if `QR = FALSE`. When `QR = TRUE`, it is a similar but not identical implementation of `poly`. The code is well commented, which should help you understand both methods.
## return a model matrix for data `x`
my_poly <- function (x, degree = 1, QR = TRUE) {
  ## check feasibility: `degree` must be smaller than the number of unique points
  if (length(unique(x)) <= degree)
    stop("insufficient unique data points for specified degree!")
  ## centring covariates (so that `x` is orthogonal to intercept)
  centre <- mean(x)
  x <- x - centre
  if (QR) {
    ## QR factorization of design matrix of ordinary polynomial
    QR <- qr(outer(x, 0:degree, "^"))
    ## X <- qr.Q(QR) * rep(diag(QR$qr), each = length(x))
    ## i.e., column rescaling of Q factor by `diag(R)`
    ## also drop the intercept
    X <- qr.qy(QR, diag(diag(QR$qr), length(x), degree + 1))[, -1, drop = FALSE]
    ## now columns of `X` are orthogonal to each other
    ## i.e., `crossprod(X)` is diagonal
    X2 <- X * X
    norm2 <- colSums(X2)  ## squared L2 norm
    alpha <- drop(crossprod(X2, x)) / norm2
    beta <- norm2 / (c(length(x), norm2[-degree]))
    colnames(X) <- 1:degree
  }
  else {
    beta <- alpha <- norm2 <- numeric(degree)
    ## repeat first polynomial `x` on all columns to initialize design matrix X
    X <- matrix(x, nrow = length(x), ncol = degree, dimnames = list(NULL, 1:degree))
    ## compute alpha[1] and beta[1]
    norm2[1] <- new_norm <- drop(crossprod(x))
    alpha[1] <- sum(x ^ 3) / new_norm
    beta[1] <- new_norm / length(x)
    if (degree > 1L) {
      old_norm <- new_norm
      ## second polynomial
      X[, 2] <- Xi <- (x - alpha[1]) * X[, 1] - beta[1]
      norm2[2] <- new_norm <- drop(crossprod(Xi))
      alpha[2] <- drop(crossprod(Xi * Xi, x)) / new_norm
      beta[2] <- new_norm / old_norm
      old_norm <- new_norm
      ## further polynomials obtained from recursion
      i <- 3
      while (i <= degree) {
        X[, i] <- Xi <- (x - alpha[i - 1]) * X[, i - 1] - beta[i - 1] * X[, i - 2]
        norm2[i] <- new_norm <- drop(crossprod(Xi))
        alpha[i] <- drop(crossprod(Xi * Xi, x)) / new_norm
        beta[i] <- new_norm / old_norm
        old_norm <- new_norm
        i <- i + 1
      }
    }
  }
  ## column rescaling so that `crossprod(X)` is an identity matrix
  scale <- sqrt(norm2)
  X <- X * rep(1 / scale, each = length(x))
  ## add attributes and return
  attr(X, "coefs") <- list(centre = centre, scale = scale, alpha = alpha[-degree], beta = beta[-degree])
  X
}
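The QR route described in Section 1 can also be checked against `stats::poly` directly, without `my_poly`. The sketch below uses base R only. The sign correction `sgn` is an assumption about how the QR factor may differ from the output of `poly` (the factorization can flip the sign of a column), and the names `xc`, `Q`, `sgn`, `X_qr` and `X_poly` are illustrative.

```r
x <- c(1.160, 1.143, 1.126, 1.109, 1.079, 1.053, 1.040, 1.027,
       1.015, 1.004, 0.994, 0.985, 0.977)
xc <- x - mean(x)                      # centred covariate
QR <- qr(outer(xc, 0:5, "^"))          # QR of the ordinary polynomial design matrix
Q  <- qr.Q(QR)[, -1]                   # orthonormal columns, intercept dropped
sgn <- sign(diag(qr.R(QR)))[-1]        # signs of the diagonal of R
X_qr <- Q * rep(sgn, each = length(x)) # undo possible sign flips column by column

X_poly <- unclass(poly(x, degree = 5))
all.equal(X_qr, X_poly, check.attributes = FALSE)
all.equal(crossprod(X_qr), diag(5))    # columns are orthonormal
```

Both comparisons are expected to return `TRUE` for this data, since rescaling the columns `Q * diag(R)` to unit length leaves `Q` up to sign.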
Section 4: Explanation of the output of `my_poly`
X <- my_poly(x, 5, FALSE)
The resulting matrix is the same as the one generated by `poly`, hence it is left out. The attributes are not the same.
#attr(,"coefs")
#attr(,"coefs")$centre
#[1] 1.054769
#attr(,"coefs")$scale
#[1] 2.173023e-01 1.014321e-02 5.050106e-04 2.359482e-05 1.075466e-06
#attr(,"coefs")$alpha
#[1] 0.024025005 0.009147498 0.020930616 0.008309835
#attr(,"coefs")$beta
#[1] 0.003632331 0.002178825 0.002478848 0.002182892
`my_poly` returns the construction information more explicitly:
`centre` gives `x_bar = mean(x)`;
`scale` gives the column norms (the square root of `norm2` returned by `poly`, without its first two entries);
`alpha` gives `alpha1`, `alpha2`, `alpha3`, `alpha4`;
`beta` gives `beta1`, `beta2`, `beta3`, `beta4`.
Section 5: Prediction routine for `my_poly`
Since `my_poly` returns different attributes, `stats:::predict.poly` is not compatible with `my_poly`. Here is the appropriate routine, `my_predict_poly`:
## return a linear predictor matrix, given a model matrix `X` and new data `x`
my_predict_poly <- function (X, x) {
  ## extract construction info
  coefs <- attr(X, "coefs")
  centre <- coefs$centre
  alpha <- coefs$alpha
  beta <- coefs$beta
  degree <- ncol(X)
  ## centring `x`
  x <- x - centre
  ## repeat first polynomial `x` on all columns to initialize design matrix X
  X <- matrix(x, length(x), degree, dimnames = list(NULL, 1:degree))
  if (degree > 1L) {
    ## second polynomial
    X[, 2] <- (x - alpha[1]) * X[, 1] - beta[1]
    ## further polynomials obtained from recursion
    i <- 3
    while (i <= degree) {
      X[, i] <- (x - alpha[i - 1]) * X[, i - 1] - beta[i - 1] * X[, i - 2]
      i <- i + 1
    }
  }
  ## column rescaling so that `crossprod(X)` is an identity matrix
  X * rep(1 / coefs$scale, each = length(x))
}
Consider an example:
set.seed(0); x1 <- runif(5, min(x), max(x))
and
stats:::predict.poly(poly(x, 5), x1)
my_predict_poly(my_poly(x, 5, FALSE), x1)
give exactly the same predictor matrix:
# 1 2 3 4 5
#[1,] 0.39726381 0.1721267 -0.10562568 -0.3312680 -0.4587345
#[2,] -0.13428822 -0.2050351 0.28374304 -0.0858400 -0.2202396
#[3,] -0.04450277 -0.3259792 0.16493099 0.2393501 -0.2634766
#[4,] 0.12454047 -0.3499992 -0.24270235 0.3411163 0.3891214
#[5,] 0.40695739 0.2034296 -0.05758283 -0.2999763 -0.4682834
Be aware that the prediction routine simply reuses the existing construction information rather than reconstructing the polynomials.
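To make this concrete for `poly` itself, the recursion can be replayed from the stored `"coefs"` attribute alone. The helper `eval_basis` below is a hypothetical name and a minimal sketch of the idea, not the source of `predict.poly`; it assumes the attribute layout described in Section 2 (`alpha` on the original scale, `norm2` padded with a leading 1).

```r
x <- c(1.160, 1.143, 1.126, 1.109, 1.079, 1.053, 1.040, 1.027,
       1.015, 1.004, 0.994, 0.985, 0.977)
P <- poly(x, degree = 5)
coefs <- attr(P, "coefs")
alpha <- coefs$alpha
norm2 <- coefs$norm2
degree <- length(alpha)

## hypothetical helper: evaluate the stored basis at new points `x_new`
eval_basis <- function (x_new) {
  Z <- matrix(1, length(x_new), degree + 1)   # first column is P0 = 1
  Z[, 2] <- x_new - alpha[1]                  # P1 = x - x_bar
  for (i in seq_len(degree)[-1]) {
    ## three-term recursion; norm2[i + 1] / norm2[i] plays the role of beta
    Z[, i + 1] <- (x_new - alpha[i]) * Z[, i] - (norm2[i + 1] / norm2[i]) * Z[, i - 1]
  }
  Z <- Z / rep(sqrt(norm2[-1]), each = length(x_new))  # rescale columns
  Z[, -1, drop = FALSE]                                # drop the intercept
}

set.seed(0); x1 <- runif(5, min(x), max(x))
all.equal(eval_basis(x1), unclass(predict(P, x1)), check.attributes = FALSE)
```

Nothing is re-estimated from `x1`: the new points only enter through the recursion, while `alpha` and `norm2` come from the original fit.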
Section 6: Just treat `poly` and `predict.poly` as a black box
There is rarely a need to understand everything inside. For statistical modelling it is sufficient to know that `poly` constructs a polynomial basis for model fitting, whose coefficients can be found in `lmObject$coefficients`. When making predictions, `predict.poly` never needs to be called by the user, since `predict.lm` will do it for you. In this way, it is absolutely OK to just treat `poly` and `predict.poly` as a black box.
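A minimal sketch of that black-box workflow, using the data from the question. The variable names `fit`, `pred`, `B` and `manual` are illustrative, and the manual reconstruction is included only to show what `predict.lm` is doing behind the scenes.

```r
x <- c(1.160, 1.143, 1.126, 1.109, 1.079, 1.053, 1.040, 1.027,
       1.015, 1.004, 0.994, 0.985, 0.977)
y <- c(1.217395, 1.604360, 2.834947, 4.585687, 8.770932, 9.996260, 9.264800,
       9.155079, 7.949278, 7.317690, 6.377519, 6.409620, 6.643426)

## fit with `poly()` written inside the formula
fit <- lm(y ~ poly(x, degree = 5))
coef(fit)    # intercept plus one coefficient per orthogonal column

## prediction at new points: no explicit call to `predict.poly` is needed
set.seed(0); x1 <- runif(5, min(x), max(x))
pred <- predict(fit, newdata = data.frame(x = x1))

## the same numbers, assembled by hand from the basis and the coefficients
B <- cbind(1, unclass(predict(poly(x, degree = 5), x1)))
manual <- drop(B %*% coef(fit))
all.equal(unname(pred), manual)
```

Writing `poly(x, degree = 5)` inside the formula, rather than passing a precomputed matrix, is what lets `predict.lm` rebuild the basis for `newdata`.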