Calculate a difference in Stata with if command
I want to calculate something like
by group: egen x if y==1 - x if y==2
Of course this is not real Stata code, but I'm kind of lost. In R this is done simply by putting "[]" after the variable of interest, but I'm not sure how to do it in Stata.
In R it would be
x[y==1] - x[y==2]
I would use reshape.
clear
version 11.2
set seed 2001
* generate data
set obs 100
generate y = 1 + mod(_n - 1, 2)
generate x = rnormal()
generate group = 1 + floor((_n - 1) / 2)
list in 1/10
* reshape to wide and difference
reshape wide x, i(group) j(y)
generate x_diff = x1 - x2
list in 1/5
I would use reshape in R, also. Otherwise, can you be sure that everything is properly ordered to give you the difference you want?
There is likely a neat Mata solution, but I know very little Mata. You may find preserve and restore helpful if you're averse to reshaping.
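One way preserve and restore could be combined with reshape is sketched below: the differences are computed in wide form, saved to a temporary file, and merged back once the long data are restored. This is only an illustration under assumptions. It reuses the simulated data from the example above, the tempfile name diffs is made up here, and merge m:1 needs Stata 11 or later.

```stata
clear
set seed 2001
set obs 100
generate y = 1 + mod(_n - 1, 2)
generate x = rnormal()
generate group = 1 + floor((_n - 1) / 2)

* hypothetical temporary file to hold one difference per group
tempfile diffs

* snapshot the long data, then work on a wide copy
preserve
reshape wide x, i(group) j(y)
generate x_diff = x1 - x2
keep group x_diff
save `diffs'
restore

* back in the long layout: attach each group's difference
merge m:1 group using `diffs', nogenerate
list in 1/6
```

The data keep their original long structure, with x_diff repeated on both observations of each group.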
Richard Herron makes a good point that a reshape to a different structure might be worthwhile. Here I focus on how to do it with the existing structure.
Assuming that there are precisely two observations for each value of group, one with y == 1 and one with y == 2, then
bysort group (y) : gen diff = x[1] - x[2]
gives the difference between values of x, necessarily repeated for each of the two observations in a group.
An assumption-free method is
bysort group: egen mean_1 = mean(x / (y == 1))
by group: egen mean_2 = mean(x / (y == 2))
gen diff = mean_1 - mean_2
Consider expressions such as x / (y == 1). Here the denominator y == 1 is 1 when y is indeed 1 and 0 otherwise. Division by 0 yields missing in Stata, but the egen command here ignores those. So the first of the three commands above yields the mean of x for observations for which y == 1, and the second the mean of x for observations for which y == 2. Other values of y (even missings) will be ignored. This method should agree with the first method when the first method is valid.
For a review of similar problems, see http://stata-journal.com/article.html?article=dm0055
In Stata the if referred to here is a qualifier (not a command).
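The distinction can be illustrated with a small sketch on a made-up four-observation dataset (not from the question). As a qualifier, if restricts a command to the observations that satisfy the condition. Stata also has a separate if programming command, which evaluates its expression only once, so a bare variable name there refers to the value in the first observation.

```stata
clear
set obs 4
generate y = 1 + mod(_n - 1, 2)
generate x = _n

* if as a qualifier: only observations with y == 1 are summarized
summarize x if y == 1

* if as a programming command: evaluated once, so y here means y[1]
if y == 1 {
    display "y is 1 in the first observation"
}
```

This is why the calculation in the question calls for the qualifier, or for subscripts under by:, rather than the if command.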