model.matrix works for "month" column of df but gives unexpected output for "week" column
I am trying to construct a model matrix using model.matrix. Here's my data, stored as a data frame called wILI:
date value week month year
1997-10-01 0.002734167 1 10 1997
1997-10-08 0.003612784 2 10 1997
1997-10-15 0.004757731 3 10 1997
1997-10-22 0.006238000 4 10 1997
1997-10-29 0.008132015 5 10 1997
1997-11-05 0.010522688 6 11 1997
1997-11-12 0.013487294 7 11 1997
1997-11-19 0.017080349 8 11 1997
1997-11-26 0.021308731 9 11 1997
1997-12-03 0.026101156 10 12 1997
1997-12-10 0.031279133 11 12 1997
1997-12-17 0.036542190 12 12 1997
1997-12-24 0.041482753 13 12 1997
1997-12-31 0.045640193 14 12 1997
1998-01-07 0.048587584 15 01 1998
1998-01-14 0.050025386 16 01 1998
1998-01-21 0.049847167 17 01 1998
1998-01-28 0.048152678 18 01 1998
1998-02-04 0.045207680 19 02 1998
1998-02-11 0.041371773 20 02 1998
1998-02-18 0.037022686 21 02 1998
1998-02-25 0.032498271 22 02 1998
1998-03-04 0.028064335 23 03 1998
1998-03-11 0.023905745 24 03 1998
1998-03-18 0.020133246 25 03 1998
1998-03-25 0.016798043 26 03 1998
1998-04-01 0.013908254 27 04 1998
1998-04-08 0.011443810 28 04 1998
1998-04-15 0.009368329 29 04 1998
1998-04-22 0.007637759 30 04 1998
1998-04-29 0.006206186 31 04 1998
1998-05-06 0.005029414 32 05 1998
1998-05-13 0.004066965 33 05 1998
1998-05-20 0.003282970 34 05 1998
1998-05-27 0.002646398 35 05 1998
I am testing two models for the wILI data, one with a month regressor and the other with a week regressor. That is, I want a coefficient for each month (model 1) and for each week (model 2). For the above data, the possible months are 1,2,3,4,5,10,11,12 and the possible weeks are 1,2,...,35. When I use model.matrix(~ 0 + month, wILI), it works as expected:
month01 month02 month03 month04 month05 month10 month11 month12
0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0
0 0 0 0 0 0 1 0
0 0 0 0 0 0 1 0
0 0 0 0 0 0 1 0
0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 1
1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0
0 1 0 0 0 0 0 0
0 1 0 0 0 0 0 0
0 1 0 0 0 0 0 0
0 0 1 0 0 0 0 0
0 0 1 0 0 0 0 0
0 0 1 0 0 0 0 0
0 0 1 0 0 0 0 0
0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0
0 0 0 0 1 0 0 0
0 0 0 0 1 0 0 0
0 0 0 0 1 0 0 0
0 0 0 0 1 0 0 0
The ith row has a 1 in the column of its corresponding month and zeros in all the other columns, just like I want. But when I try the same thing using "week" instead of "month", I get this:
week
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
...Huh?? Why am I getting a 35x1 vector? I want a 35x35 matrix where the first row has a 1 in the first column and zeros everywhere else, the second row has a 1 in the second column and zeros everywhere else, the third row has a 1 in the third column and zeros everywhere else, etc. (i.e. the 35x35 identity matrix). Any suggestions for how to accomplish this? And why should the output be so different by simply changing "month" to "week"?
Ensure that week and month are factors (or character vectors). A numeric predictor becomes a single column in the model matrix, whereas a factor generates one column per level, or one column for every level except one if the model has an intercept. In your data, month is presumably stored as character or factor (its values print as 01, 02, ... and the columns are named month01, month02, ...), which is why it already worked, while week is numeric. If a column is already a factor or character, the factor(...) around the variable can be omitted.
model.matrix(~ factor(month) + 0, wILI)
model.matrix(~ factor(week) + 0, wILI)
Another way to write this, which gives nicer coefficient names, is:
model.matrix(~ month + 0, transform(wILI, month = factor(month)))
model.matrix(~ week + 0, transform(wILI, week = factor(week)))
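To see the numeric-versus-factor behaviour in isolation, here is a minimal sketch on a made-up data frame. The name toy and its values are assumptions for illustration only and are not the wILI data. It also shows what happens when the intercept is kept.

```r
# Toy data (hypothetical): week is numeric, month is character
toy <- data.frame(
  week  = 1:4,
  month = c("10", "10", "11", "11"),
  stringsAsFactors = FALSE
)

# Numeric predictor: a single column holding the raw values
m_num <- model.matrix(~ 0 + week, toy)
dim(m_num)

# Factor predictor: one indicator column per level
m_fac <- model.matrix(~ 0 + factor(week), toy)
dim(m_fac)

# With exactly one row per week, the indicator matrix matches the identity
all(m_fac == diag(4))

# A character column is converted to a factor, so no factor() is needed
m_chr <- model.matrix(~ 0 + month, toy)
colnames(m_chr)

# With an intercept (default treatment contrasts), the first level is
# absorbed into "(Intercept)" and only the remaining levels get a column
m_int <- model.matrix(~ factor(week), toy)
colnames(m_int)
```

Running class(wILI$week) and class(wILI$month) on the real data should confirm which of the two cases each column falls into.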