
Frequencies not adding up (SAS PROC SQL)

I'm trying to find the frequencies for only unique ID numbers. I tried PROC FREQ, but couldn't figure out how to do whatever the SAS equivalent of SELECT DISTINCT is. I ran the following code and got numbers that don't add up.

Code:

PROC SQL;
SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n;

Result: 20599

Code:

PROC SQL;
SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n
WHERE MOTPComponentDescription = '1a (obs): Demonstrating knowledge of content and pedagogy';

PROC SQL;
SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n
WHERE MOTPComponentDescription = '1a (p&p): Demonstrating knowledge of content and pedagogy';

PROC SQL;
SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n
WHERE MOTPComponentDescription = '1e (obs): Designing coherent instruction';

PROC SQL;
SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n
WHERE MOTPComponentDescription = '1e (p&p): Designing coherent instruction';

PROC SQL;
SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n
WHERE MOTPComponentDescription = '2a: Creating an environment of respect and rapport';

PROC SQL;
SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n
WHERE MOTPComponentDescription = '2d: Managing student behavior';

PROC SQL;
SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n
WHERE MOTPComponentDescription = '3b: Using questioning and discussion techniques';

PROC SQL;
SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n
WHERE MOTPComponentDescription = '3c: Engaging students in learning';

PROC SQL;
SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n
WHERE MOTPComponentDescription = '3d: Using assessment in instruction';

PROC SQL;
SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n
WHERE MOTPComponentDescription = '4e (obs): Growing and developing professionally';

PROC SQL;
SELECT COUNT (DISTINCT MOTPID) FROM WORK.'0__1_MOTP_COMMENTS_0000'n
WHERE MOTPComponentDescription = '4e (p&p): Growing and developing professionally';

View a snippet of the dataset here: https://docs.google.com/spreadsheets/d/1WDcsezb4xiT67J9t3Nlyi_QEofs0dhyZ23yC32ccbqg/edit?usp=sharing

Result: 1a (obs): Demonstrating knowledge of content and pedagogy: 700

1a (p&p): Demonstrating knowledge of content and pedagogy: 606

1e (obs): Designing coherent instruction: 15622

1e (p&p): Designing coherent instruction: 1135

2a: Creating an environment of respect and rapport: 2466

2d: Managing student behavior: 1005

3b: Using questioning and discussion techniques: 808

3c: Engaging students in learning: 2516

3d: Using assessment in instruction: 3058

4e (obs): Growing and developing professionally: 5245

4e (p&p): Growing and developing professionally: 588

SUM = 33749

33749 != 20599

Looking for any ideas on what went wrong, or whether there's a better way to get my desired result (the count of unique MOTPIDs by MOTPComponentDescription). Thanks so much in advance!

To discuss SAS issues on StackOverflow, the example data in the SASHELP library comes in very handy. Let us use the CARS dataset.

title "What you see as a problem is no problem";

title2 "Counting all makes";

proc sql;
    select count (distinct Make) as distinct_makes from sashelp.cars;
quit;
This gives 38.

title2 "Counting the makes that produce cars with a certain number of cylinders";

proc sql;
    select 'n.a.' as Cylinders, count (distinct Make) as distinct_makes from sashelp.cars where Cylinders = . union
    select ' 3  ' as Cylinders, count (distinct Make) as distinct_makes from sashelp.cars where Cylinders = 3 union
    select ' 4  ' as Cylinders, count (distinct Make) as distinct_makes from sashelp.cars where Cylinders = 4 union
    select ' 5  ' as Cylinders, count (distinct Make) as distinct_makes from sashelp.cars where Cylinders = 5 union
    select ' 6  ' as Cylinders, count (distinct Make) as distinct_makes from sashelp.cars where Cylinders = 6 union
    select ' 8  ' as Cylinders, count (distinct Make) as distinct_makes from sashelp.cars where Cylinders = 8 union
    select '10  ' as Cylinders, count (distinct Make) as distinct_makes from sashelp.cars where Cylinders = 10 union
    select '12  ' as Cylinders, count (distinct Make) as distinct_makes from sashelp.cars where Cylinders = 12;
quit;
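This gives 1 make producing 3-cylinder cars, 26 producing 4-cylinder cars, and so forth, "adding up" to more than 80. A make that builds cars in several cylinder classes is counted once in each class, so the per-class counts are not expected to sum to the 38 distinct makes.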
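The same per-class counts can probably be obtained more compactly with GROUP BY instead of one SELECT per value. This is a minimal sketch using only SASHELP.CARS and the column names from the query above; missing Cylinders values form their own group.

```sas
proc sql;
    /* One output row per Cylinders value (including missing),
       each counting the distinct makes within that group */
    select Cylinders,
           count(distinct Make) as distinct_makes
        from sashelp.cars
        group by Cylinders;
quit;
```

The totals per group should match the UNION query, and they still overlap for the same reason: each make is counted once in every group it appears in.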

title2 "You can manually verify the results in these listings";

proc sql;
    select Cylinders, Make, Model from sashelp.cars order by Cylinders, Make;
    select Make, Cylinders, Model from sashelp.cars order by Make, Cylinders;
quit;
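Rather than reading the listings by eye, the overlap can also be checked with a query. The sketch below lists the makes that occur under more than one Cylinders value; the alias cylinder_groups is a name chosen here for illustration. Note that COUNT(DISTINCT ...) ignores missing values, so a missing Cylinders value is not counted as a separate class in this check.

```sas
proc sql;
    /* Makes appearing under more than one non-missing Cylinders value.
       Each of these is counted several times in the per-class totals,
       but only once in the overall distinct count. */
    select Make,
           count(distinct Cylinders) as cylinder_groups
        from sashelp.cars
        group by Make
        having count(distinct Cylinders) > 1
        order by cylinder_groups desc, Make;
quit;
```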

title "What you call a solution is producing unpredictable results";

title2 "It produces these results if the input is sorted one way";

proc sort data=sashelp.cars out=cars_short2long;
    by length;
run;
proc sort data=cars_short2long nodupkey out=cars_short2long_clean dupout=dups;
    by Make;
run;
proc freq data=cars_short2long_clean;
    table Cylinders;
run;
This output suggests that no make produces 10-cylinder cars.

title2 "It produces these results if the input is sorted another way";

proc sort data=sashelp.cars out=cars_long2short;
    by descending length;
run;
proc sort data=cars_long2short nodupkey out=cars_long2short_clean dupout=dups;
    by Make;
run;
proc freq data=cars_long2short_clean;
    table Cylinders;
run;
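This output suggests that no make produces 3-cylinder cars. NODUPKEY keeps only the first observation for each Make, so which row survives, and therefore the frequency table, depends on how the input was sorted.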
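Applied to the data in the question, the order-independent count of distinct IDs per description might look like the sketch below. It assumes the table and column names exactly as they appear in the question (WORK.'0__1_MOTP_COMMENTS_0000'n, MOTPID, MOTPComponentDescription); the alias distinct_ids is made up for illustration.

```sas
proc sql;
    /* Distinct MOTPID count per component description.
       An ID with comments under several descriptions is counted
       once in each, so the rows need not sum to the overall total. */
    select MOTPComponentDescription,
           count(distinct MOTPID) as distinct_ids
        from WORK.'0__1_MOTP_COMMENTS_0000'n
        group by MOTPComponentDescription;
quit;
```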

Here is the solution I worked out, which got the exact result I was looking for:

data comment_analysis;
set WORK.'0__1_MOTP_COMMENTS_0001'n;
run;

proc sort data=comment_analysis nodupkey out=comment_analysis_clean dupout=dups;
by motpid;
run;

proc freq data=comment_analysis_clean;
table MOTPComponentDescription;
run;

Here is the output I was looking for:

MOTPComponentDescription                                   Frequency  Percent
1a (obs): Demonstrating knowledge of content and pedagogy        520    2.52%
1a (p&p): Demonstrating knowledge of content and pedagogy        400    1.94%
1e (obs): Designing coherent instruction                       11423   55.45%
1e (p&p): Designing coherent instruction                         526    2.55%
2a: Creating an environment of respect and rapport              1629    7.91%
2d: Managing student behavior                                    556    2.70%
3b: Using questioning and discussion techniques                  563    2.73%
3c: Engaging students in learning                               1593    7.73%
3d: Using assessment in instruction                             1818    8.83%
4e (obs): Growing and developing professionally                 1235    6.00%
4e (p&p): Growing and developing professionally                  336    1.64%
