Finding maximum value in a group using proc means in SAS?

Suppose I have data like this in my table:

match_day  name     Goals

1          Higuain  4
1          Messi    1
1          Ozil     4
1          Villa    3
1          Xavi     4
2          Benzema  4
2          Messi    4
2          Ronaldo  3
2          Villa    4
2          Xavi     4

Now I want to find out which player scored the most goals in each match. I tried doing it with:

  proc means data=b nway max;
  class match_day name;
  var goals;
  output out=c(drop=_type_ _freq_) max=goals;
  run;

But this does not work. What is the correct way of doing this?

This isn't something you can easily do in PROC MEANS. It's much easier to do in SQL or the data step. The most direct solution:

proc sort data=b;
by match_day descending goals; *so the highest goal number is at top;
run; 

data c;
set b;
by match_day;
if first.match_day; *the first record per match_day;
run;

That will give you the record with the largest number of goals for each match_day. If there is a tie, you will not get more than one record; you get whichever tied record happens to sort first.

If you want to keep all records with that number of goals, you can do the following (it relies on the same sort by match_day and descending goals as above):

data c;
set b;
retain keep;
by match_day descending goals;
if first.match_day then keep=1; *the first record per match_day, flag to keep;
if keep=1 then output;          *output records to be kept;
if last.goals then keep=0;      *clear the keep flag at the end of the first goals set;
drop keep;
run;
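
The answer above mentions SQL as an alternative to the data step. A minimal sketch of that route, assuming the same input table b with columns match_day, name and goals (the output name c is simply reused from the examples above):

```sas
/* Keep every row whose goals equals the maximum within its match_day. */
/* Ties are kept, and the input does not need to be sorted first.      */
proc sql;
  create table c as
  select match_day, name, goals
  from b
  group by match_day
  having goals = max(goals)
  order by match_day, name;
quit;
```

SAS writes a note to the log that the query requires remerging summary statistics back with the original data; that is expected for this kind of query.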

Just to clear up the PROC MEANS syntax, you could use the following code to show the top goal scorer per match_day.

proc means data=b noprint nway;
class match_day;
output out=c(drop=_:) maxid(goals(name goals))=;
run;

However, you run into the issue raised by @Joe: only one record per match_day is returned, which isn't ideal here because there are ties for top scorer.

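If you want to use a procedure, then PROC RANK can do this for you. With DESCENDING and TIES=LOW, every player tied for the most goals in a match gets rank 1, so all of them are kept. The BY statement expects the data to be sorted by match_day.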

proc rank data=b out=c (where=(goals_rank=1)) ties=low descending;
by match_day;
var goals;
ranks goals_rank;
run;
