
Calculate a difference in Stata with if command

I want to calculate something like

by group: egen x if y==1 - x if y==2

Of course this is not real Stata code, but I'm kind of lost. In R this is simply done with "[]" after the variable of interest, but I'm not sure how to do it in Stata.

In R this would be

x[y==1] - x[y==2]

I would use reshape.

clear
version 11.2
set seed 2001

* generate data
set obs 100
generate y = 1 + mod(_n - 1, 2)
generate x = rnormal()
generate group = 1 + floor((_n - 1) / 2)
list in 1/10

* reshape to wide and difference
reshape wide x, i(group) j(y)
generate x_diff = x1 - x2
list in 1/5

I would use reshape in R, also. Otherwise, can you be sure that everything is properly ordered to give you the difference you want?

There is likely a neat Mata solution, but I know very little Mata. You may find preserve and restore helpful if you're averse to reshaping.

Richard Herron makes a good point that a reshape to a different structure might be worthwhile. Here I focus on how to do it with the existing structure.

Assuming that there are precisely two observations for each value of group, one with y == 1 and one with y == 2, then

bysort group (y) : gen diff = x[1] - x[2] 

gives the difference between the two values of x, necessarily repeated for both observations in each group. An assumption-free method is

bysort group: egen mean_1 = mean(x / (y == 1)) 
by group: egen mean_2 = mean(x / (y == 2)) 
gen diff = mean_1 - mean_2 

Consider expressions such as x / (y == 1). Here the denominator y == 1 is 1 when y is indeed 1 and 0 otherwise. Division by 0 yields missing in Stata, but the egen command here ignores missing values. So the first of the three commands above yields the mean of x for observations for which y == 1, and the second yields the mean of x for observations for which y == 2. Other values of y (even missings) will be ignored. This method should agree with the first method whenever the first method is valid.
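As a quick check of that last claim, here is a minimal sketch on a made-up toy dataset (the values and the names diff_a and diff_b are illustrative assumptions, not from the question). Each group has exactly one observation with y == 1 and one with y == 2, so both methods should give the same answer.

```stata
clear
* hypothetical toy data: two groups, one y == 1 and one y == 2 row in each
input group y x
1 1 5
1 2 3
2 1 10
2 2 4
end

* Method 1: relies on exactly one y == 1 and one y == 2 observation per group
bysort group (y): generate diff_a = x[1] - x[2]

* Method 2: dividing by (y == k) turns the unwanted observations into missing,
* which egen's mean() ignores
bysort group: egen mean_1 = mean(x / (y == 1))
by group: egen mean_2 = mean(x / (y == 2))
generate diff_b = mean_1 - mean_2

* the two methods should agree when the assumption holds
* (here 5 - 3 = 2 for group 1 and 10 - 4 = 6 for group 2)
assert diff_a == diff_b
list, sepby(group)
```

If a group had several observations with the same value of y, the first method would silently take whichever came first after sorting, while the second would difference the group means.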

For a review of similar problems, see http://stata-journal.com/article.html?article=dm0055

In Stata the if referred to here is a qualifier (not a command).
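To illustrate what a qualifier does, here is a small sketch on made-up data (the values and the variable name m1 are assumptions for illustration). The if qualifier restricts which observations a command uses, so a result computed with it is only filled in for those observations. That is why the pseudo-code in the question cannot be written as a single subtraction of two qualified expressions.

```stata
clear
* hypothetical toy data: one group with one observation for each value of y
input group y x
1 1 5
1 2 3
end

* if as a qualifier: summarize uses only the observations with y == 1
summarize x if y == 1

* with egen, the new variable is filled in only where the qualifier is true;
* m1 is missing for the observation with y == 2
bysort group: egen m1 = mean(x) if y == 1
list
```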


 