
How to join a fact table to a (Kimball type 2?) slowly changing date in SAS

New to SQL - I'd like to join the fact table crselist to the crseinfo table to get the correct dimension info. I've been working on some correlated subqueries, but none give the desired result (below). The crseinfo table says that, beginning in 199610, ART 508 belongs to college 09 and should be called OkArt; that information is updated in 200220 and again in 200300. Crselist lists the courses actually taught.

data crseinfo ; 
input crsenme $ crsenum crsefx crsecollege $ crsedesc $9.;
cards;
ART 508 199610 09 OkArt
ART 508 200220 18 WowItsArt
ART 508 200300 18 SuperArt
;
run;

data crselist; 
input  crsenme $ crsenum term section $; 
cards;
ART 508 199610 01
ART 508 199610 02
ART 508 199610 03
ART 508 199710 01
ART 508 200220 01
ART 508 200220 02
ART 508 201020 01
ART 508 201120 01
;
run;

The desired result would then be:

data desired ; 
input  crsenme $ crsenum term section $ crsecollege $ crsedesc $9.;
cards;
ART 508 199610 01 09 OkArt
ART 508 199610 02 09 OkArt
ART 508 199610 03 09 OkArt
ART 508 199710 01 09 OkArt
ART 508 200220 01 18 WowItsArt
ART 508 200220 02 18 WowItsArt
ART 508 201020 01 18 SuperArt
ART 508 201120 01 18 SuperArt
;

Referring to the SAS help page ( http://web.utk.edu/sas/OnlineTutor/1.2/en/60477/m70/m70_52.htm ), it would seem like I could do something like:

proc sql ; 
select * 
from crseinfo a, crselist b
where a.crsenme eq b.crsenme and 
  a.crsenum eq b.crsenum and 
  b.term eq (select min(c.term) 
   from crselist c 
   where c.term ge a.crsefx )
   ;
quit;

But this does not work. I am interested in a SQL-based solution - thank you for your time.

You're nearly there. Rather than using a correlated subquery, I think it's simpler to do this with a combination of group by and having clauses: join each taught course to every crseinfo row already in effect for that term, then keep only the row with the latest effective date.

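proc sql noprint _method;   /* _method (undocumented) writes the query plan to the log */
    create table desired2 as
    select a.*, b.crsecollege, b.crsedesc
    from crselist a left join crseinfo b
        on a.crsenme = b.crsenme and a.crsenum = b.crsenum
    where a.term ge b.crsefx             /* only versions already in effect for the term */
    group by a.crsenme, a.crsenum, a.term
    having b.crsefx = max(b.crsefx)      /* keep the most recent of those versions */
    ;
quit;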

Slightly simpler version:

proc sql noprint _method;
    create table desired3 as
    select a.*, b.crsecollege, b.crsedesc
    from crselist a, crseinfo b
    where a.crsenme = b.crsenme
      and a.crsenum = b.crsenum
      and a.term ge b.crsefx
    group by a.crsenme, a.crsenum, a.term, a.section
    having b.crsefx = max(b.crsefx)
    ;
quit;

For the sample data, both queries produce the same result, but the latter is more readily optimised into a hash join.
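Since the question started from correlated subqueries, that form can be made to work too. The following is a minimal sketch rather than the approach recommended above: for each taught course it looks up the latest crsefx that is not later than the term. The table name desired4 and the order by clause are additions for illustration only, and the proc compare step assumes the desired data set from the question has already been created.

```sas
proc sql;
    create table desired4 as             /* desired4 is a hypothetical name */
    select a.*, b.crsecollege, b.crsedesc
    from crselist a, crseinfo b
    where a.crsenme = b.crsenme
      and a.crsenum = b.crsenum
      /* pick the dimension row whose effective date is the latest
         one on or before the term in which the course was taught */
      and b.crsefx = (select max(c.crsefx)
                      from crseinfo c
                      where c.crsenme = a.crsenme
                        and c.crsenum = a.crsenum
                        and c.crsefx le a.term)
    order by a.term, a.section;          /* same row order as the desired data set */
quit;

/* Optional check against the hand-built table from the question */
proc compare base=desired compare=desired4;
run;
```

The subquery here is correlated on the course and the term, whereas the attempt in the question correlated only on crsefx and took the minimum term from crselist. Note also that the group by / having versions select columns that are not in the group by list, so SAS writes a note to the log about remerging summary statistics with the original data; that is expected for this pattern.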
