model.matrix works for "month" column of df but gives unexpected output for "week" column

I am trying to construct a model matrix using model.matrix. Here's my data, stored as a data frame called wILI:

date       value      week month year
1997-10-01  0.002734167 1   10  1997
1997-10-08  0.003612784 2   10  1997
1997-10-15  0.004757731 3   10  1997
1997-10-22  0.006238000 4   10  1997
1997-10-29  0.008132015 5   10  1997
1997-11-05  0.010522688 6   11  1997
1997-11-12  0.013487294 7   11  1997
1997-11-19  0.017080349 8   11  1997
1997-11-26  0.021308731 9   11  1997
1997-12-03  0.026101156 10  12  1997
1997-12-10  0.031279133 11  12  1997
1997-12-17  0.036542190 12  12  1997
1997-12-24  0.041482753 13  12  1997
1997-12-31  0.045640193 14  12  1997
1998-01-07  0.048587584 15  01  1998
1998-01-14  0.050025386 16  01  1998
1998-01-21  0.049847167 17  01  1998
1998-01-28  0.048152678 18  01  1998
1998-02-04  0.045207680 19  02  1998
1998-02-11  0.041371773 20  02  1998
1998-02-18  0.037022686 21  02  1998
1998-02-25  0.032498271 22  02  1998
1998-03-04  0.028064335 23  03  1998
1998-03-11  0.023905745 24  03  1998
1998-03-18  0.020133246 25  03  1998
1998-03-25  0.016798043 26  03  1998
1998-04-01  0.013908254 27  04  1998
1998-04-08  0.011443810 28  04  1998
1998-04-15  0.009368329 29  04  1998
1998-04-22  0.007637759 30  04  1998
1998-04-29  0.006206186 31  04  1998
1998-05-06  0.005029414 32  05  1998
1998-05-13  0.004066965 33  05  1998
1998-05-20  0.003282970 34  05  1998
1998-05-27  0.002646398 35  05  1998 

I am testing two models for the wILI data, one with a month regressor and the other with a week regressor. That is, I want a coefficient for each month (model 1) and for each week (model 2). For the above data, the possible months are 1, 2, 3, 4, 5, 10, 11, 12 and the possible weeks are 1, 2, ..., 35. When I use model.matrix(~ 0 + month, wILI), it works as expected:

month01 month02 month03 month04 month05 month10 month11 month12
0   0   0   0   0   1   0   0
0   0   0   0   0   1   0   0
0   0   0   0   0   1   0   0
0   0   0   0   0   1   0   0
0   0   0   0   0   1   0   0
0   0   0   0   0   0   1   0
0   0   0   0   0   0   1   0
0   0   0   0   0   0   1   0
0   0   0   0   0   0   1   0
0   0   0   0   0   0   0   1
0   0   0   0   0   0   0   1
0   0   0   0   0   0   0   1
0   0   0   0   0   0   0   1
0   0   0   0   0   0   0   1
1   0   0   0   0   0   0   0
1   0   0   0   0   0   0   0
1   0   0   0   0   0   0   0
1   0   0   0   0   0   0   0
0   1   0   0   0   0   0   0
0   1   0   0   0   0   0   0
0   1   0   0   0   0   0   0
0   1   0   0   0   0   0   0
0   0   1   0   0   0   0   0
0   0   1   0   0   0   0   0
0   0   1   0   0   0   0   0
0   0   1   0   0   0   0   0
0   0   0   1   0   0   0   0
0   0   0   1   0   0   0   0
0   0   0   1   0   0   0   0
0   0   0   1   0   0   0   0
0   0   0   1   0   0   0   0
0   0   0   0   1   0   0   0
0   0   0   0   1   0   0   0
0   0   0   0   1   0   0   0
0   0   0   0   1   0   0   0

Each row has a 1 in the column of its corresponding month and zeros in all the other columns, just like I want. But when I try the same thing using "week" instead of "month", I get this:

week
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35

...Huh?? Why am I getting a 35x1 vector? I want a 35x35 matrix where the first row has a 1 in the first column and zeros everywhere else, the second row has a 1 in the second column and zeros everywhere else, and so on (i.e. the 35x35 identity matrix). Any suggestions for how to accomplish this? And why should the output be so different simply from changing "month" to "week"?

Make sure that week and month are factors (or character vectors). A numeric predictor becomes a single column in the model matrix, whereas a factor generates one column per level, or one column for every level but one if the model has an intercept. Judging by the leading zeros in the month values (01, 02, ...) and the month01-style column names, month is presumably already stored as character or factor, which would explain why it worked, while week is numeric. If a column is already a factor or character vector, the factor(...) wrapper around that variable can be omitted.

model.matrix(~ factor(month) + 0, wILI)
model.matrix(~ factor(week) + 0, wILI)
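
To see the difference on something small, here is a minimal sketch using a made-up six-row data frame called toy (a hypothetical stand-in for wILI, not the original data). It assumes week is stored as a number and month as a character string, which is a guess based on the output shown in the question.

```r
# Toy stand-in for wILI (hypothetical data): numeric week, character month.
toy <- data.frame(
  week  = 1:6,
  month = c("10", "10", "11", "11", "12", "12"),
  stringsAsFactors = FALSE
)

# Inspect how each column is stored; this decides how model.matrix treats it.
str(toy)

# Numeric week: a single column holding the raw values.
m_num <- model.matrix(~ 0 + week, toy)
dim(m_num)

# Factor week, no intercept: one indicator column per level.
m_fac <- model.matrix(~ 0 + factor(week), toy)
dim(m_fac)

# Factor week, with intercept: the first level is absorbed into the intercept.
m_int <- model.matrix(~ factor(week), toy)
colnames(m_int)

# Character month is converted to a factor automatically.
m_month <- model.matrix(~ 0 + month, toy)
colnames(m_month)
```

Running str(wILI) on the real data should show right away which of the two cases each column falls into.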

Another way to write this, which gives nicer coefficient names, is:

model.matrix(~ month + 0, transform(wILI, month = factor(month)))
model.matrix(~ week + 0, transform(wILI, week = factor(week)))
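
The naming difference between the two styles can be checked on a small made-up data frame (toy is hypothetical, not the wILI data). This is only an illustration of how the column names are formed:

```r
# Hypothetical three-level example.
toy <- data.frame(week = c(1, 2, 3, 1, 2, 3))

# factor() inside the formula: the call text becomes part of each name.
names_inline <- colnames(model.matrix(~ factor(week) + 0, toy))
names_inline

# Converting the column first: names use the bare variable name.
names_clean <- colnames(
  model.matrix(~ week + 0, transform(toy, week = factor(week)))
)
names_clean
```

The first form gives names like factor(week)1, the second gives week1, and those names carry over to the coefficients of any model fitted with the same formula.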
