R, linear regression, lm, approx
I would like to use linear regression to estimate the Concentration from the count. This is a sample of my dataset:
              Concentration  count#0
Ctcf                            3153
Err                             2228
Nkx3-2                             4
Isl/                               6
Engrailed                         10
Dr                                14
Usf                              461
Dach1/Dac                       4185
POS_C(8)             139664     1143
POS_A(128)          2234624     8897
POS_F(0.125)           2182       20
POS_D(2)              34916      220
POS_B(32)            558656     3359
POS_E(0.5)             8729       21
I am wondering whether it is better to use lm and then predict, or to use approx and approxfun. I am not an expert in statistics and I couldn't find any explanation on the Internet. Thanks!
lm is what you use if you want to fit an ordinary linear regression (LR). If you believe that your response can be well described by a linear combination of your predictors, then LR might be appropriate. You don't need the data to be normal for an LR to work, but you do need (approximately) normal errors if you're going to compute test statistics for the parameters and such. Also, if you're interested in inference and coefficient interpretation, don't forget to check the usual diagnostics: residuals with mean 0, common variance and no trends, plus checks for outliers, multicollinearity, normality, etc.
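As a rough illustration of the lm-then-predict route, here is a minimal sketch using the numbers from the question. It assumes the POS_ rows are the calibration points with known Concentration and that a straight line on the raw scale is adequate; both are assumptions, not something the question confirms. The data frame and column names (controls, targets, estimated) are made up for the example.

```r
# Positive controls from the question: known concentration and observed count.
controls <- data.frame(
  name          = c("POS_A", "POS_B", "POS_C", "POS_D", "POS_E", "POS_F"),
  concentration = c(2234624, 558656, 139664, 34916, 8729, 2182),
  count         = c(8897, 3359, 1143, 220, 21, 20)
)

# Targets with an observed count but no known concentration.
targets <- data.frame(
  name  = c("Ctcf", "Err", "Nkx3-2", "Isl/", "Engrailed", "Dr", "Usf", "Dach1/Dac"),
  count = c(3153, 2228, 4, 6, 10, 14, 461, 4185)
)

# Ordinary linear regression of concentration on count
# (fitting on the raw scale is an assumption of this sketch).
fit <- lm(concentration ~ count, data = controls)

# Estimate the concentration of each target from the fitted line.
targets$estimated <- predict(fit, newdata = targets)
print(targets)

# A first look at the usual diagnostics: with an intercept in the model,
# the residuals average to (numerically) zero; plot(fit) shows the rest.
print(mean(residuals(fit)))
```

With only six calibration points, the diagnostic plots from plot(fit) are worth a look before trusting the estimates, especially for counts far from the bulk of the controls.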
The actual model for an LR is Y = X %*% beta + e, where Y, beta, and e are vectors, X is a matrix, and %*% denotes matrix multiplication. This notation assumes that the first column of X is all 1's. By default, lm uses a QR decomposition, which lets it avoid computing t(X) %*% X, let alone its inverse; that is a big time saver if X is large.
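To make the notation concrete, the sketch below builds X for the calibration points from the question and solves the least-squares problem through a QR decomposition. It only illustrates the idea using base R's qr and qr.coef; it is not a reproduction of lm's internal code, and the variable names are made up for the example.

```r
# Calibration points from the question (POS_ rows).
controls <- data.frame(
  concentration = c(2234624, 558656, 139664, 34916, 8729, 2182),
  count         = c(8897, 3359, 1143, 220, 21, 20)
)

fit <- lm(concentration ~ count, data = controls)

# Design matrix X: the intercept column of 1's comes first.
X <- model.matrix(fit)
Y <- controls$concentration
print(X)

# Solve the least-squares problem via a QR decomposition of X,
# without ever forming t(X) %*% X.
beta_qr <- qr.coef(qr(X), Y)

# The two sets of coefficients should agree.
print(beta_qr)
print(coef(fit))
```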
lm finds (though not by direct computation) solve(t(X) %*% X) %*% t(X) %*% Y, which gives us the unique estimate of beta, provided X has full column rank.
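For a small, well-conditioned problem the closed-form expression can be evaluated directly and compared with what lm returns. The sketch below does this for the calibration points from the question; it is meant only as a check of the formula, since forming t(X) %*% X explicitly is exactly what lm avoids. Variable names are made up for the example.

```r
# Calibration points from the question (POS_ rows).
controls <- data.frame(
  concentration = c(2234624, 558656, 139664, 34916, 8729, 2182),
  count         = c(8897, 3359, 1143, 220, 21, 20)
)

# Design matrix with a leading column of 1's, and the response vector.
X <- cbind(1, controls$count)
Y <- controls$concentration

# X must have full column rank for the estimate to be unique.
print(qr(X)$rank)

# Direct evaluation of the normal-equations formula.
beta_direct <- solve(t(X) %*% X) %*% t(X) %*% Y

# The same estimate as computed by lm.
beta_lm <- coef(lm(concentration ~ count, data = controls))

print(as.vector(beta_direct))
print(unname(beta_lm))
```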
You definitely do not want to use anything else if a plain LR is all you want.
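For contrast, here is a small sketch of what approx and approxfun do with the same calibration points. They interpolate linearly between neighbouring points instead of fitting a single line, and with the default rule = 1 they return NA for counts outside the range of the calibration counts. The variable names are made up for the example.

```r
# Calibration points from the question (POS_ rows).
count         <- c(8897, 3359, 1143, 220, 21, 20)
concentration <- c(2234624, 558656, 139664, 34916, 8729, 2182)

# Counts of the targets whose concentration is unknown.
target_count <- c(3153, 2228, 4, 6, 10, 14, 461, 4185)

# approx() interpolates at the requested points and returns a list (x, y).
interp <- approx(count, concentration, xout = target_count)$y

# approxfun() returns a function that performs the same interpolation.
f <- approxfun(count, concentration)

# Counts below the smallest calibration count (20) come back as NA,
# because the default rule = 1 does not extrapolate.
print(data.frame(count = target_count, interpolated = interp))

# The interpolant passes exactly through every calibration point,
# so it follows any noise in the controls instead of averaging it out.
print(f(count))
```

A regression line, by comparison, gives an estimate for any count and smooths over noise in the controls, at the price of assuming the relationship really is linear.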