
How can I select last by group where something in SAS

Say I have a table Tbl that is sorted by 3 columns {a,b,c}. I also have another 100 columns, one of which is d. How can I flag the last row within a group such that d=something? The flag should be a new column. Hopefully this is doable without re-sorting the whole table.

a b c ...many columns... d IDX
1                        5 1                        
1                        3 2
1                        3 3
2                        3 4 
2                        3 5
2                        2 6
2                        2 7

On this table we want to add another column, newCol, to flag the last row within each group of a where d = 3:

a b c ...many columns... d IDX newCol
1                        5 1   0                     
1                        3 2   0
1                        3 3   1
2                        3 4   0
2                        3 5   1
2                        2 6   0
2                        2 7   0
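
data want;
    set have;
    /* NOTSORTED: equal d values only need to be contiguous within each a */
    by a d notsorted;
    /* 1 on the last row of the d=3 run in each a group, 0 on every other row */
    newCol = (last.d and d = 3);
run;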

This requires the dataset to be sorted in a useful fashion: it doesn't have to be in order by d, but within each a group all the rows with the same d value have to be together (i.e., not 3 3 1 3 4 1 2 3, but 3 3 3 3 4 1 1 2 is fine).
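As a quick check, here is a minimal sketch of that data step run against the sample rows from the question. The dataset names have and want are placeholders, and the b and c values are made up for illustration since the question does not show them; only a and d matter for the flag.

```sas
/* Sample rows from the question; b and c are invented placeholder values */
data have;
    input a b c d;
    datalines;
1 1 1 5
1 1 2 3
1 1 3 3
2 1 1 3
2 1 2 3
2 2 1 2
2 2 2 2
;
run;

data want;
    set have;
    /* d is not in sorted order within a, so NOTSORTED is required */
    by a d notsorted;
    /* last.d marks the end of each run of equal d values within a */
    newCol = (last.d and d = 3);
run;

proc print data=want noobs;
run;
```

With these rows, newCol should come out as 0 0 1 0 1 0 0, matching the desired table in the question. If the d=3 rows of one a group were split into several runs, each run would get its own flagged row, which is why the contiguity requirement matters.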

If that's not the case, then there isn't a solution that doesn't rely on sorting in some fashion, whether it be SQL (which does sort the data, it just doesn't tell you it's doing it), PROC SORT, or a hash table (which, if you can fit everything into memory, might be the fastest sort).
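One possible reading of the hash-table option is a two-pass data step, sketched below under several assumptions: the dataset is called have, the target value is d = 3, one entry per distinct value of a fits in memory, and the helper variables _n, _row, _lastrow and eof are names invented for this sketch. It is not necessarily what the answer had in mind.

```sas
data want(drop=_n _row _lastrow eof);
    if _n_ = 1 then do;
        /* Hash keyed by a, storing the row number of the last d=3 row seen */
        declare hash h();
        h.defineKey('a');
        h.defineData('_lastrow');
        h.defineDone();

        /* First pass: read only a and d, overwriting the entry on every d=3 row */
        do _n = 1 by 1 until (eof);
            set have(keep=a d) end=eof;
            if d = 3 then do;
                _lastrow = _n;
                h.replace();
            end;
        end;
    end;

    /* Second pass: read every column and compare the current row number */
    set have;
    _row + 1;
    if h.find() = 0 then newCol = (_lastrow = _row);
    else newCol = 0;   /* this a group has no d=3 rows */
run;
```

This reads the table twice but does not need the d values to be contiguous, and it leaves the row order untouched.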

I'm not sure how this gets implemented, but the following does the work that you want:

proc sql;
    select a, b, c, . . .
    from t
    group by a, b
    having c = max(c);
quit;

Note that this syntax is quite specific to SAS proc sql. It is not ANSI standard and will not work in most other databases.

This uses a process called "remerging". I'm not sure whether it re-sorts the original table.

EDIT:

Flagging the rows is just as easy:

proc sql;
    select a, b, c, (case when c = max(c) then 'Y' else 'N' end) as flag, . . .
    from t
    group by a, b;
quit;

However, if the data is already sorted, it is probably more efficient to use a data step for this purpose.
