SAS: calculating new variables for a new dataset from an existing dataset
Here is the given dataset, projet.details_etest:
"survey_instance_id" "user_id" "question_id" "Item_correct"
"2008" "14389" "4243" "0"
"2008" "14489" "4243" "1"
"2008" "14499" "4253" "0"
"2008" "1669" "4253" "1"
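For anyone who wants to reproduce the problem, here is a minimal sketch that loads the four sample rows. It assumes the libref projet has already been assigned with a LIBNAME statement and that all four columns are numeric, which the post does not state explicitly (the later comparison `Item_correct = 1` suggests it).

```sas
/* Assumption: libref projet is already assigned, e.g. via a LIBNAME statement. */
/* Assumption: all four variables are numeric. */
data projet.details_etest;
    input survey_instance_id user_id question_id Item_correct;
    datalines;
2008 14389 4243 0
2008 14489 4243 1
2008 14499 4253 0
2008 1669 4253 1
;
run;
```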
I want to create a new dataset named projet.resume_question that summarizes the details dataset, sorted by question_id, and contains these variables:
survey_instance_id
question_id
nb_correct_answers
nb_incorrect_answers
nb_omitted_answers
nb_total_with_omitted_answers
nb_total_without_omitted_answers
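The variable nb_omitted_answers is the total number of participants, minus nb_correct_answers (the number of correct answers for each question), minus nb_incorrect_answers (the number of incorrect answers for each question).
The variable nb_total_with_omitted_answers is the total number of participants who took the test.
The variable nb_total_without_omitted_answers is the total number of participants who answered each question.
Here is what I did: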
data projet.resume_question;
set projet.details_etest;
by question_id;
keep survey_instance_id question_id nb_correct_answers nb_incorrect_answers;
retain nb_correct_answers 0 nb_incorrect_answers 0;
if Item_correct =1 then correct_answers= Item_correct;
else if Item_correct =0 then incorrect_answers= Item_correct;
nb_correct_answers = sum (correct_answers);
nb_incorrect_answers= sum (incorrect_answers);
run;
proc print data=projet.resume_question;
run;
I started this way, but what I get when I print the result looks wrong to me. Can anyone help me?
First, sort the dataset by survey, question, and participant.
proc sort data = projet.details_etest out = details;
by survey_instance_id question_id user_id;
run;
Now get the number of participants in each survey.
proc sql;
create table participated as
select survey_instance_id,
count(distinct user_id) as nb_total_with_omitted_answers
from details
group by survey_instance_id;
quit;
Compute the totals by survey and question.
data aggregated;
    set details;
    by survey_instance_id question_id;
    /* Carry the counters across rows within a question */
    retain nb_total_without_omitted_answers
           nb_correct_answers nb_incorrect_answers 0;
    /* Reset the counters on the first row of each question */
    if first.question_id then do;
        nb_total_without_omitted_answers = 0;
        nb_correct_answers = 0;
        nb_incorrect_answers = 0;
    end;
    /* Count only rows that hold an actual answer (0 or 1) */
    if item_correct in (0, 1) then nb_total_without_omitted_answers + 1;
    if item_correct = 1 then nb_correct_answers + 1;
    else if item_correct = 0 then nb_incorrect_answers + 1;
    /* Write one summary row per survey and question */
    if last.question_id then output;
    drop user_id item_correct;
run;
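If the BY-group logic is unclear, one way to inspect it is to write the automatic first. and last. flags to the log. This is only a debugging sketch; it assumes the sorted work dataset details produced by the PROC SORT step above and creates no output dataset.

```sas
/* Debugging aid: show where each question's BY group starts and ends. */
data _null_;
    set details;
    by survey_instance_id question_id;
    /* first.question_id is 1 on the first row of a (survey, question) group, */
    /* last.question_id is 1 on its last row; both are 0 otherwise.           */
    put survey_instance_id= question_id= user_id=
        first.question_id= last.question_id=;
run;
```

The counters are reset where first.question_id is 1, and a summary row is written where last.question_id is 1.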
Finally, compute the number of omitted answers for each question.
data projet.resume_question;
merge participated aggregated;
by survey_instance_id;
nb_omitted_answers = nb_total_with_omitted_answers -
nb_correct_answers - nb_incorrect_answers;
run;
This should give you what you need.
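As an alternative, the same summary can probably be produced in a single PROC SQL step. The sketch below is not part of the approach above; it relies on SAS evaluating a comparison such as `Item_correct = 1` to 1 or 0, so that summing it counts the matching rows. The output table name work.resume_question_sql is hypothetical, chosen so that it does not overwrite projet.resume_question.

```sas
proc sql;
    /* Hypothetical output name; change it to projet.resume_question if desired. */
    create table work.resume_question_sql as
    select d.survey_instance_id,
           d.question_id,
           /* A true comparison evaluates to 1, a false one to 0 */
           sum(d.Item_correct = 1) as nb_correct_answers,
           sum(d.Item_correct = 0) as nb_incorrect_answers,
           sum(d.Item_correct in (0, 1)) as nb_total_without_omitted_answers,
           p.nb_total_with_omitted_answers,
           p.nb_total_with_omitted_answers
               - sum(d.Item_correct = 1)
               - sum(d.Item_correct = 0) as nb_omitted_answers
    from projet.details_etest as d
         inner join
         /* Distinct participants per survey, as in the participated table */
         (select survey_instance_id,
                 count(distinct user_id) as nb_total_with_omitted_answers
          from projet.details_etest
          group by survey_instance_id) as p
         on d.survey_instance_id = p.survey_instance_id
    group by d.survey_instance_id, d.question_id,
             p.nb_total_with_omitted_answers
    order by d.survey_instance_id, d.question_id;
quit;
```

With the four sample rows, survey 2008 has 4 distinct participants, and each of questions 4243 and 4253 has 1 correct and 1 incorrect answer, so both versions should report 4 - 1 - 1 = 2 omitted answers per question.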