Merging two datasets where variables in one correspond to observations in the other
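I ran into a challenge today and would appreciate some help. I want to merge two datasets. Dataset1 contains a member roster with three variables: batch_id, member_num (the member's number within the batch), and occupation. Dataset2 holds each member's license status.

The challenge is that in Dataset2 the member number is encoded in the variable names: member_num_x in Dataset2 corresponds to the observation with member_num equal to x in Dataset1. I need to merge the two datasets so that I end up with a single dataset containing batch_id, member_num, occupation, and license status for each member.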
Dataset1
| batch_id | member_num | occupation |
| -------- | -------- | -------- |
| A01 | 1 | Driver |
| A01 | 2 | Driver |
| A01 | 3 | Driver |
| A01 | 4 | Driver |
| A02 | 1 | Navigator |
| A02 | 2 | Navigator |
Dataset2
| batch_id |member_num_1|member_num_2|member_num_3|member_num_4|
| -------- | -------- | -------- | -------- | -------- |
| A01 | Yes | NA | Yes | No |
| A02 | No | NA | | |
Desired Output
| batch_id | member_num | occupation | License_status |
| -------- | -------- | -------- | -------- |
| A01 | 1 | Driver | Yes |
| A01 | 2 | Driver | NA |
| A01 | 3 | Driver | Yes |
| A01 | 4 | Driver | No |
| A02 | 1 | Navigator | No |
| A02 | 2 | Navigator | NA |
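I tried the merge command in Stata, but it has no option for this particular kind of merge. Its options all rely on key variables that identify observations (much like a join on a primary key).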
You need to reshape d2 to long format, then merge or link on batch_id and member_num.

    * Option 1: keep d1 in memory and link to d2 held in a separate frame
    clear
    use d1
    frame create d2
    frame d2: use d2
    frame d2: reshape long member_num_, i(batch_id) j(member_num)
    frlink 1:1 batch_id member_num, frame(d2)
    frget License_status = member_num_, from(d2)

    * Option 2: reshape d2, save it to a temporary file, and merge
    clear
    use d2
    reshape long member_num_, i(batch_id) j(member_num)
    rename member_num_ License_status
    tempfile d2long
    save `d2long', replace
    use d1, clear
    merge 1:1 batch_id member_num using `d2long', nogenerate keep(1 3)
Result of the frames version (d2 is the link variable created by frlink):

        batch_id   member~m   occupat~n   d2   Licens~s
    1.  A01        1          Driver      1    Yes
    2.  A01        2          Driver      2    NA
    3.  A01        3          Driver      3    Yes
    4.  A01        4          Driver      4    No
    5.  A02        1          Navigator   5    No
    6.  A02        2          Navigator   6    NA
d1.dta:

        batch_id   member~m   occupat~n
    1.  A01        1          Driver
    2.  A01        2          Driver
    3.  A01        3          Driver
    4.  A01        4          Driver
    5.  A02        1          Navigator
    6.  A02        2          Navigator

d2.dta:

        batch_id   member~1   member~2   member~3   member~4
    1.  A01        Yes        NA         Yes        No
    2.  A02        No         NA
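For readers who want to run the reshape-then-merge steps end to end, the d1 and d2 listings can be recreated with `input`. This is a minimal sketch, not part of the original answer. It assumes the license values are stored as strings (including the literal text "NA") and that the blank cells for A02 are empty strings; the string widths (str3, str9) and the tempfile name d2long are choices made for the illustration.

```stata
* Recreate d2 (wide): one row per batch, one column per member slot
* Assumption: statuses are strings and blank cells are empty strings
clear
input str3 batch_id str3 member_num_1 str3 member_num_2 str3 member_num_3 str3 member_num_4
"A01" "Yes" "NA" "Yes" "No"
"A02" "No"  "NA" ""    ""
end

* Wide -> long: the numeric suffix of member_num_# becomes member_num
reshape long member_num_, i(batch_id) j(member_num)
rename member_num_ License_status
tempfile d2long
save `d2long'

* Recreate d1 (the roster), which already has one row per member
clear
input str3 batch_id byte member_num str9 occupation
"A01" 1 "Driver"
"A01" 2 "Driver"
"A01" 3 "Driver"
"A01" 4 "Driver"
"A02" 1 "Navigator"
"A02" 2 "Navigator"
end

* keep(master match) is the same as keep(1 3): every roster row is kept,
* and the unused A02 slots 3 and 4 from d2 are dropped
merge 1:1 batch_id member_num using `d2long', nogenerate keep(master match)
list, sepby(batch_id)
```

Run the block as a single do-file rather than line by line: the tempfile name is a local macro and is lost once the do-file (or the selected chunk) finishes.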
The final answer, with small edits:

    clear
    use "file_Dataset2.dta"
    * reshape the license file to one row per batch_id and member_num
    reshape long member_num_, i(batch_id) j(member_num)
    rename member_num_ License_status
    tempfile d2long
    save `d2long', replace
    use "file_Dataset1.dta", clear
    merge m:1 batch_id member_num using `d2long', nogenerate keep(1 3)
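The final version uses m:1 where the answer used 1:1. Both give the same result as long as batch_id and member_num together uniquely identify the roster rows. One way to check that before choosing is `isid`, as in the sketch below. The small roster here is hypothetical and only mirrors the layout of Dataset1.

```stata
* Hypothetical roster with the same layout as Dataset1
clear
input str3 batch_id byte member_num str9 occupation
"A01" 1 "Driver"
"A01" 2 "Driver"
"A02" 1 "Navigator"
end

* isid exits with a nonzero return code if the key is not unique;
* capture suppresses the error so that _rc can be inspected
capture isid batch_id member_num
if _rc == 0 {
    display "key is unique in the roster: merge 1:1 is appropriate"
}
else {
    display "duplicate keys in the roster: merge m:1 is needed"
    duplicates report batch_id member_num
}
```

If the check fails, m:1 still runs, but each duplicated roster row receives the same license status, so it is worth looking at the duplicates report first.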