Calculate a status difference using the if command

Question

I want to calculate something like

by group: egen x if y==1 - x if y==2

Of course this is not real Stata code, but I'm kind of lost. In R this is simply done with "[]" after the variable of interest, but I'm not sure about Stata.

In R it would be

x[y==1] - x[y==2]

Answer 1

I would use reshape.

clear
version 11.2
set seed 2001

* generate data
set obs 100
generate y = 1 + mod(_n - 1, 2)
generate x = rnormal()
generate group = 1 + floor((_n - 1) / 2)
list in 1/10

* reshape to wide and difference
reshape wide x, i(group) j(y)
generate x_diff = x1 - x2
list in 1/5

I would use reshape in R, also. Otherwise, can you be sure that everything is properly ordered to give you the difference you want?

There is likely a neat Mata solution, but I know very little Mata. You may find preserve and restore helpful if you're averse to reshaping.
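One possible reading of the preserve and restore suggestion is sketched below: do the reshape on a temporary copy, save the per-group difference, then restore the original long data and merge the difference back in. The tempfile name diffs is hypothetical, and the simulated data simply repeat the example above; treat this as a sketch under those assumptions, not the only way to do it.

```stata
* Sketch only: simulated data as in the example above (group, y in {1, 2}, x)
clear
set seed 2001
set obs 100
generate y = 1 + mod(_n - 1, 2)
generate x = rnormal()
generate group = 1 + floor((_n - 1) / 2)

* work on a temporary wide copy; the long data are held by preserve
preserve
reshape wide x, i(group) j(y)
generate x_diff = x1 - x2
keep group x_diff
tempfile diffs          // hypothetical name for the temporary file
save `diffs'
restore

* the long data are back; attach the difference to every observation
merge m:1 group using `diffs', nogenerate
list in 1/4
```

After the merge, x_diff is repeated on both observations of each group, and the dataset keeps its original long structure.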

Answer 2

Richard Herron makes a good point that a reshape to a different structure might be worthwhile. Here I focus on how to do it with the existing structure.

Assuming that there are precisely two observations for each value of group, one with y == 1 and one with y == 2, then

bysort group (y) : gen diff = x[1] - x[2]

gives the difference between values of x, necessarily repeated for both observations in each group. An assumption-free method is

bysort group: egen mean_1 = mean(x / (y == 1)) 
by group: egen mean_2 = mean(x / (y == 2)) 
gen diff = mean_1 - mean_2

Consider expressions such as x / (y == 1). Here the denominator y == 1 is 1 when y is indeed 1 and 0 otherwise. Division by 0 yields missing in Stata, but egen's mean() function here ignores missing values. So the first of the three commands above yields the mean of x for observations for which y == 1, and the second the mean of x for observations for which y == 2. Other values of y (even missings) will be ignored. This method should agree with the first method when the first method is valid.

For a review of similar problems, see http://stata-journal.com/article.html?article=dm0055

In Stata the if referred to here is a qualifier (not a command).
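To see the claimed agreement concretely, the following sketch runs both methods on simulated data and compares the results. The data-generating lines and the names diff_a and diff_b are assumptions made for illustration; the first assert checks the two-observations-per-group assumption that the first method relies on.

```stata
* Sketch only: simulated data with exactly one y == 1 and one y == 2 per group
clear
set seed 2001
set obs 100
generate y = 1 + mod(_n - 1, 2)
generate x = rnormal()
generate group = 1 + floor((_n - 1) / 2)

* check the assumption behind the first method
bysort group (y): assert _N == 2 & y[1] == 1 & y[2] == 2

* first method: subscripts within sorted groups
bysort group (y): generate diff_a = x[1] - x[2]

* second method: division by a true-or-false denominator
bysort group: egen mean_1 = mean(x / (y == 1))
by group: egen mean_2 = mean(x / (y == 2))
generate diff_b = mean_1 - mean_2

* the two results should agree up to rounding
assert reldif(diff_a, diff_b) < 1e-6
```

If the first assert fails on real data, only the second method is safe to use as written.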
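As a small illustration of the if qualifier, the sketch below restricts summarize to the observations satisfying a condition and stores each mean in a scalar. The simulated data and the scalar names m1 and m2 are assumptions; note that this computes one difference for the whole sample, not one per group.

```stata
* Sketch only: simulated data with y alternating between 1 and 2
clear
set seed 2001
set obs 100
generate y = 1 + mod(_n - 1, 2)
generate x = rnormal()

* the if qualifier limits each command to the matching observations
summarize x if y == 1, meanonly
scalar m1 = r(mean)
summarize x if y == 2, meanonly
scalar m2 = r(mean)

display "difference in means = " m1 - m2
```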

Answer 1
2 accepted 2013-01-23 14:37:24

Answer 2
2 2013-01-23 17:02:03
