Regression coefficients by group in dataframe R
I have financial data for various companies, organized by company ticker. I'd like to regress one of the columns' values against the others while holding the company constant. Is there an easy way to write this out in lm() notation?
I've tried using:
reg <- lmList(lead2.dDA ~ paudit1 + abs.d.GINDEX + logcapx + logmkvalt +
logmkvalt2|pp, data=reg.df)
where pp is a vector of company names, but this returns coefficients as though I had regressed all the data at once (and had not separated it by company name).
A convenient and apparently little-known syntax for estimating separate regression coefficients by group in lm() involves using the nesting operator, /. In this case it would look like:
reg <- lm(lead2.dDA ~ 0 + pp/(paudit1 + abs.d.GINDEX + logcapx +
logmkvalt + logmkvalt2), data=reg.df)
Make sure that pp is a factor and not a numeric. Also notice that the overall intercept must be suppressed for this to work; in the new formulation, we have a different "intercept" for each group.
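To make the nesting syntax concrete, here is a minimal sketch on simulated data. The names toy, g, x and y are made up for illustration and are not from the question's reg.df. It uses base R only and checks that the single nested lm() fit reproduces the coefficients you would get by fitting each group on its own.

```r
# Simulated data; toy, g, x and y are illustrative names, not the question's data.
set.seed(1)
toy <- data.frame(
  g = rep(c("A", "B", "C"), each = 20),
  x = rnorm(60),
  stringsAsFactors = FALSE
)

# Each group gets its own true intercept and slope.
true_int   <- c(A = 1, B = 2, C = 3)
true_slope <- c(A = 0.5, B = -1, C = 2)
toy$y <- true_int[toy$g] + true_slope[toy$g] * toy$x + rnorm(60, sd = 0.3)

# The grouping variable should be a factor, not a numeric or character column.
toy$g <- factor(toy$g)

# One model, overall intercept suppressed: an intercept and a slope per group.
nested <- lm(y ~ 0 + g / x, data = toy)

# Reference fits: one ordinary regression per group (2 x 3 matrix of coefficients).
separate <- sapply(split(toy, toy$g), function(d) coef(lm(y ~ x, data = d)))

# Arrange the nested coefficients in the same layout: intercepts, then slopes.
lv <- levels(toy$g)
nested_mat <- rbind(coef(nested)[paste0("g", lv)],
                    coef(nested)[paste0("g", lv, ":x")])

# TRUE if both approaches agree up to numerical tolerance.
same_coefs <- isTRUE(all.equal(unname(nested_mat), unname(separate)))
print(same_coefs)
```

In the nested fit the coefficients named gA, gB, gC are the per-group intercepts and gA:x, gB:x, gC:x are the per-group slopes, which is why the overall intercept has to be dropped with 0 +.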
A couple of comments:
Although the regression coefficients obtained this way match those given by lmList(), it should be noted that with lm() we estimate only a single residual variance across all the groups, whereas lmList() would estimate separate residual variances for each group.
The lmList() syntax that you gave looks like it should have worked. Since you say it didn't, this leads me to suspect that the real problem is something else (although it's hard to tell what without a reproducible example), and so it seems likely that the solution I posted will fail for you as well, for the same unknown reasons.
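The point about residual variance can be illustrated with a small sketch, again on simulated data with made-up names (toy, g, x, y). To avoid depending on a particular lmList() implementation, separate per-group lm() fits stand in for it here; that substitution is an assumption of this example, not something from the question.

```r
# Simulated data; toy, g, x and y are illustrative names only.
set.seed(2)
toy <- data.frame(g = factor(rep(c("A", "B"), each = 30)), x = rnorm(60))

# Group B is deliberately given much noisier errors than group A.
toy$y <- 1 + toy$x + rnorm(60, sd = ifelse(toy$g == "A", 0.2, 2))

# Single nested model: one residual variance shared by all groups.
pooled <- lm(y ~ 0 + g / x, data = toy)

# Separate fits per group (standing in for lmList): one residual variance each.
by_group <- lapply(split(toy, toy$g), function(d) lm(y ~ x, data = d))

sigma_pooled   <- sigma(pooled)           # one residual standard deviation
sigma_by_group <- sapply(by_group, sigma) # one residual standard deviation per group
print(sigma_pooled)
print(sigma_by_group)

# The pooled variance equals the summed per-group residual sums of squares
# divided by the summed residual degrees of freedom.
rss <- sapply(by_group, function(m) sum(resid(m)^2))
df  <- sapply(by_group, df.residual)
pooled_matches <- isTRUE(all.equal(sigma_pooled^2, sum(rss) / sum(df)))
print(pooled_matches)
```

So the point estimates agree, but standard errors and tests from the nested lm() fit rest on the pooled variance, which may matter if the groups differ a lot in noise.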