
Stata loop over observations with years as string variables

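I am trying to write a loop that generates and fills in a dummy variable indicating whether an individual was a member of a particular party in a given year. My data are in long format, with each observation being a person-year. They look as follows.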

X1                  X2                  X3                 
AR, 1972-1981       PDC, 1982-1986      PFL, 1986-.  
MD, 1966-1980       PMDB, 1980-1988     PSB, 1988-.  
MD, 1966-1968       AR, 1968-1980       PDS, 1980-1985

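The party comes before the comma; after it are the years during which the person was a member of that party. Any help would be greatly appreciated!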

The code I have so far is:

rename X1 XA  
rename X2 XB  
rename X3 XC

foreach var of varlist XA XB XC {
  split `var', parse (,)  
}
tabulate XA1, gen(p)

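Here is one way to do it. I had to make an assumption about the missing end year in X3, so you will need to change that to suit your data.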

/* Enter Data */
clear

input str20 X1 str20 X2 str20 X3                 
"AR, 1972-1981"       "PDC, 1982-1986"      "PFL, 1986-."  
"MD, 1966-1980"       "PMDB, 1980-1988"     "PSB, 1988-."  
"MD, 1966-1968"       "AR, 1968-1980"       "PDS, 1980-1985"
end

compress

/* Split X1,X2,X3 into party, start year and end year and create 3 ID variables that we need later */
forvalues v=1/3 {
    split X`v', parse(", " "-")
    gen id`v'=_n
}

/* Make the years numeric, fill in the missing end year, and get rid of the messy original data */
destring X12 X13 X22 X23 X32 X33, replace
replace X33 = 1990 if missing(X33) // enter your survey year here 
drop X1 X2 X3

/* stack the spells on top of each other */
stack (id1 X11 X12 X13) (id2 X21 X22 X23) (id3 X31 X32 X33), into(id party year1 year2) clear
drop _stack

/* Put the data into long format and fill in the gaps */
reshape long year, i(id party) j(p)
drop p
/* need this b/c people can be in more than one party in a given year */
egen idparty = group(id party), label
xtset idparty year
tsfill
carryforward id party, replace // user-written command: ssc install carryforward
drop idparty

/* create party dummies */
tab party, gen(DD_)

/* rename the dummies to have party affiliation at the end instead of numbers */
foreach var of varlist DD_* {
    levelsof party if `var'==1, local(party) clean
    rename `var' ind_`party'
}

drop party

/* get back down to one person-year observation */
collapse (max) ind_*, by(id year)

list id year ind_*, sepby(id) noobs
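
The gap-filling step above depends on tsfill plus carryforward, and carryforward is a user-written command from SSC, not part of official Stata. If installing it is not an option, the same fill can probably be done with built-in commands. The following is only a minimal sketch on a made-up two-row spell (the toy data and the asserts are assumptions for illustration, not part of the answer above):

```stata
clear
input id str4 party year
1 "AR" 1972
1 "AR" 1975
end

egen idparty = group(id party)
xtset idparty year
tsfill                      // adds 1973 and 1974 with missing id and party
assert _N == 4

* built-in stand-in for carryforward: copy the previous row's value into gaps
bysort idparty (year): replace id = id[_n-1] if missing(id)
bysort idparty (year): replace party = party[_n-1] if missing(party)
assert !missing(id) & !missing(party)

list, noobs
```

Because replace works through the observations in order, each filled row can serve as the source for the next one, so gaps of any length within a spell are covered.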

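Following Dimitriy's lead (and explanation), here is a slightly different approach. I make a different assumption about the missing end points: the series is truncated at the last known year.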

clear
set more off

input ///
str15 (XA                  XB                  XC)                 
"AR, 1972-1981"       "PDC, 1982-1986"     "PFL, 1986-."
"MD, 1966-1980"       "PMDB, 1980-1988"    "PSB, 1988-."
"MD, 1966-1968"       "AR, 1968-1980"    "PDS, 1980-1985"
end

list

*----- what you want? -----

// main
stack X*, into(X) clear
bysort _stack: gen id = _n
order id, first

split X, parse (, -)
rename (X1 X2 X3) (party sdate edate)

destring ?date, replace
gen diff = edate - sdate + 1
expand diff

bysort id party: replace sdate = sdate[1] + _n - 1

drop _stack X edate diff

// create indicator variables
tabulate party, gen(y)

// fix years with two or more parties
levelsof party, local(lp) clean
collapse (sum) y*, by(id sdate)

// rename
unab ly: y*
rename (`ly') (`lp')

list, sepby(id)
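
If you would prefer open-ended spells to run through a fixed cutoff year instead of being truncated, one possible tweak is to fill in the missing end year before computing diff. This is a sketch under that assumption; the local macro cutoff and the value 1990 are hypothetical (1990 is borrowed from the first answer) and should be replaced with your own survey year:

```stata
clear
input str15 (XA XB XC)
"AR, 1972-1981"   "PDC, 1982-1986"    "PFL, 1986-."
"MD, 1966-1980"   "PMDB, 1980-1988"   "PSB, 1988-."
"MD, 1966-1968"   "AR, 1968-1980"     "PDS, 1980-1985"
end

stack X*, into(X) clear
bysort _stack: gen id = _n
split X, parse(, -)
rename (X1 X2 X3) (party sdate edate)
destring ?date, replace

* assumption: open-ended spells last through the cutoff year (hypothetical value)
local cutoff 1990
replace edate = `cutoff' if missing(edate)
assert !missing(edate)

* one row per party-year, as in the code above
gen diff = edate - sdate + 1
expand diff
bysort id party: replace sdate = sdate[1] + _n - 1

* the PFL spell now covers 1986 through 1990
count if party == "PFL"
assert r(N) == 5

list id party sdate if party == "PFL", noobs
```

The rest of the code above (tabulate, collapse, rename) should then run unchanged on the expanded data.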

