R formula operator: what does ~0+a mean?

Question

I have seen how the ~ operator is used in formulas. For example, y~x means: y is distributed as x.

However, I am really confused about what ~0+a means in this code:

require(limma)
a = factor(1:3)
model.matrix(~0+a)

Why doesn't plain model.matrix(a) work? Why is the result of model.matrix(~a) different from that of model.matrix(~0+a)? And finally, what is the meaning of the ~ operator here?

Answer 1

~ creates a formula: it separates the left-hand and right-hand sides of the formula.

From ?`~`

Tilde is used to separate the left- and right-hand sides in model formula
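A quick way to see the two sides is to inspect a formula object directly. The sketch below is only illustrative; the names f and g are made up, and none of the variables need to exist because a formula is not evaluated when it is created.

```r
# A two-sided formula is an unevaluated call to `~` with two arguments.
f <- y ~ x
class(f)    # "formula"
length(f)   # 3: the `~` itself, the left-hand side, the right-hand side
f[[2]]      # left-hand side:  y
f[[3]]      # right-hand side: x

# A one-sided formula, as passed to model.matrix(), has no left-hand side.
g <- ~ 0 + a
length(g)   # 2: the `~` itself and the right-hand side
all.vars(g) # "a" (the 0 is a constant, not a variable)
```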

Quoting from the help for formula

The models fit by, e.g., the lm and glm functions are specified in a compact symbolic form. The ~ operator is basic in the formation of such models. An expression of the form y ~ model is interpreted as a specification that the response y is modelled by a linear predictor specified symbolically by model. Such a model consists of a series of terms separated by + operators. The terms themselves consist of variable and factor names separated by : operators. Such a term is interpreted as the interaction of all the variables and factors appearing in the term.

In addition to + and :, a number of other operators are useful in model formulae. The * operator denotes factor crossing: a*b interpreted as a+b+a:b. The ^ operator indicates crossing to the specified degree. For example (a+b+c)^2 is identical to (a+b+c)*(a+b+c) which in turn expands to a formula containing the main effects for a, b and c together with their second-order interactions. The %in% operator indicates that the terms on its left are nested within those on the right. For example a + b %in% a expands to the formula a + a:b. The - operator removes the specified terms, so that (a+b+c)^2 - a:b is identical to a + b + c + b:c + a:c. It can also used to remove the intercept term: when fitting a linear model y ~ x - 1 specifies a line through the origin. A model with no intercept can be also specified as y ~ x + 0 or y ~ 0 + x.
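The expansions described in the help text can be checked with terms(), which parses a formula without needing any data. This is a minimal sketch; expand_terms is a hypothetical helper name introduced here for illustration, not part of R.

```r
# Hypothetical helper: return the expanded term labels of a formula.
expand_terms <- function(f) attr(terms(f), "term.labels")

expand_terms(~ a * b)                 # "a" "b" "a:b"
expand_terms(~ (a + b + c)^2)         # main effects plus all two-way interactions
expand_terms(~ (a + b + c)^2 - a:b)   # same, with the a:b interaction removed

# The intercept is stored separately as a 0/1 flag on the terms object.
attr(terms(y ~ x), "intercept")       # 1
attr(terms(y ~ 0 + x), "intercept")   # 0
attr(terms(y ~ x + 0), "intercept")   # 0
attr(terms(y ~ x - 1), "intercept")   # 0
```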

So, regarding the specific issue with `~a+0`

You are creating a model matrix without an intercept. Because a is a factor, model.matrix(~a) returns an intercept column that takes the place of a1 (you need only n-1 indicators to fully specify n classes). With ~0+a or ~a+0 the intercept is dropped, so each level of a gets its own indicator column.
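The difference is easy to see by comparing the column names of the two matrices for the factor from the question. The variable names with_intercept and no_intercept are made up for this sketch.

```r
a <- factor(1:3)

# Default: an intercept column plus indicators for levels 2 and 3.
with_intercept <- model.matrix(~a)
colnames(with_intercept)   # "(Intercept)" "a2" "a3"

# Intercept removed: one indicator column per level.
no_intercept <- model.matrix(~0 + a)
colnames(no_intercept)     # "a1" "a2" "a3"

# With one observation per level, the no-intercept matrix is the identity.
all(no_intercept == diag(3))   # TRUE
```

Both matrices have three columns and describe the same three groups; only the parameterization differs.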

The help files for each function are well written, detailed and easy to find!

Why doesn't `model.matrix(a)` work?

model.matrix(a) doesn't work because a is a factor variable, not a formula or a terms object.

From the help for model.matrix

object: an object of an appropriate class. For the default method, a model formula or a terms object.

R is looking for a particular class of object. By passing the formula ~a, you are passing an object of class formula. model.matrix(terms(~a)) would also work (passing the terms object corresponding to the formula ~a).
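The class distinction can be checked directly. In this sketch the failing call is wrapped in tryCatch so the script keeps running; the name bare_factor_result and the "error" marker string are arbitrary choices made here.

```r
a <- factor(1:3)

class(a)    # "factor"
class(~a)   # "formula"

# Passing the bare factor raises an error; catch it and return a marker.
bare_factor_result <- tryCatch(model.matrix(a), error = function(e) "error")
bare_factor_result   # "error"

# A terms object built from the formula gives the same columns as the formula.
identical(colnames(model.matrix(terms(~a))), colnames(model.matrix(~a)))   # TRUE
```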

General note

As @BenBolker helpfully notes in his comment, this is a modified version of Wilkinson-Rogers notation.

There is a good description in An Introduction to R.

Answer 2

Even after reading several manuals, I remained confused about the meaning of model.matrix(~0+x) until I recently found this excellent book chapter.

In mathematics 0+a is equal to a, so writing a term like 0+a looks very strange. However, here we are dealing with linear models: a simple high-school equation such as y=ax+b that describes the relationship between the predictor variable (x) and the observation (y).

So we can think of ~0+x, or equivalently ~x+0, as an equation of the form y=ax+b. By adding 0 we force b to be zero, which means we are looking for a line passing through the origin (no intercept). If we specify a model like ~x+1 or just ~x, the fitted equation may contain a non-zero term b. Equally, we may restrict b with the formula ~x-1 or ~-1+x, both of which mean no intercept (the same way we exclude a row or column in R with a negative index). However, something like ~x-2 or ~x+3 is meaningless.
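To make the y=ax+b reading concrete, here is a small sketch that fits the same data with and without an intercept using lm. The data values are made up for illustration (roughly y = 2x + 1), and the names fit_with and fit_without are arbitrary.

```r
# Made-up data lying close to the line y = 2x + 1.
x <- c(1, 2, 3, 4, 5)
y <- c(3.1, 4.9, 7.2, 8.8, 11.1)

fit_with    <- lm(y ~ x)       # estimates both the slope a and the intercept b
fit_without <- lm(y ~ 0 + x)   # forces b = 0: a line through the origin

names(coef(fit_with))      # "(Intercept)" "x"
names(coef(fit_without))   # "x"

# ~x-1 specifies the same no-intercept model as ~0+x.
all.equal(coef(lm(y ~ x - 1)), coef(fit_without))   # TRUE
```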

Thanks to @mnel for the useful comment. Finally, why use ~ and not =? In standard mathematical notation, y~x denotes that y is equivalent to x, which is somewhat weaker than y=x. When you fit a linear model, you are not really saying y=x, but rather that you can model y as a linear function of x (y = ax+b, for example).

Answer 3

To answer part of your question, the tilde is used to separate the left- and right-hand sides in a model formula. See ?"~" for more help.
