SAS or PostgreSQL: add a column value according to another column value

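I want to add a column "index" such that rows with the same value in the "variable" column get the same "index" value. Either PostgreSQL or SAS syntax is fine.

One caveat: the values in the "variable" column change every day, as in tableA and tableB, so hard-coding is not acceptable. Any suggestions are appreciated!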

+----------+--------------+------+-------+-----+-----+-------+
| variable | new_variable | type | start | end | woe | index |
+----------+--------------+------+-------+-----+-----+-------+
| A        | mi_A         | char | 1     |     | 1.3 |     1 |
| A        | mi_A         | char | 0     |     | 0.6 |     1 |
| B        | mi_B         | char | 1     |     | 5.4 |     2 |
| B        | mi_B         | char | 0     |     | 0.1 |     2 |
| gnd_cd   | gnd_cd       | char | 3     |     | 1.3 |     3 |
| gnd_cd   | gnd_cd       | char | @0    |     | 0.6 |     3 |
| gnd_cd   | gnd_cd       | char | 2     |     | 5.4 |     3 |
| gnd_cd   | gnd_cd       | char | N     |     | 0.1 |     3 |
| gnd_cd   | gnd_cd       | char | 1     |     | 1.3 |     3 |
| gnd_cd   | gnd_cd       | char | 99    |     | 0.6 |     3 |
| mar_sign | mar_sign     | char | 0     |     | 5.4 |     4 |
| mar_sign | mar_sign     | char | Y     |     | 0.1 |     4 |
| mar_sign | mar_sign     | char | N     |     |   6 |     4 |
| C        | C            | char | 6     |     |   2 |     5 |
| C        | C            | char | 7     |     | 2.1 |     5 |
| C        | C            | char | 8     |     | 2.2 |     5 |
+----------+--------------+------+-------+-----+-----+-------+
                         (tableA)

+--------------+--------------+------+-------+-----+-----+-------+
|   variable   | new_variable | type | start | end | woe | index |
+--------------+--------------+------+-------+-----+-----+-------+
| D            | mi_D         | char | 1     |     |   1 |     1 |
| D            | mi_D         | char | 0     |     |   2 |     1 |
| E            | mi_E         | char | 1     |     |   2 |     2 |
| E            | mi_E         | char | 0     |     |   3 |     2 |
| education_bg | education_bg | char | 3     |     |   1 |     3 |
| education_bg | education_bg | char | @0    |     |   5 |     3 |
| education_bg | education_bg | char | 2     |     |   6 |     3 |
| education_bg | education_bg | char | N     |     |   4 |     3 |
| education_bg | education_bg | char | 1     |     |   3 |     3 |
| education_bg | education_bg | char | 99    |     |   3 |     3 |
| sex          | sex          | char | 0     |     |   2 |     4 |
| sex          | sex          | char | Y     |     |   1 |     4 |
| sex          | sex          | char | N     |     |   0 |     4 |
| C            | C            | char | 6     |     |   6 |     5 |
| C            | C            | char | 7     |     |   4 |     5 |
| C            | C            | char | 8     |     |   1 |     5 |
+--------------+--------------+------+-------+-----+-----+-------+
                             (tableB)

You can do this in SAS in a single data step by using the RETAIN statement together with BY-group processing on variable.

Code:

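/* Sample data (tableA); each row is pipe-delimited. */
data have;
infile datalines dlm='|';
/* Default character length is 8; widen so longer names such as education_bg are not truncated. */
length variable new_variable $ 12;
input variable $ new_variable $ type $ start $  end $ woe ;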
datalines;
| A        | mi_A         | char | 1     |     | 1.3 
| A        | mi_A         | char | 0     |     | 0.6 
| B        | mi_B         | char | 1     |     | 5.4 
| B        | mi_B         | char | 0     |     | 0.1 
| gnd_cd   | gnd_cd       | char | 3     |     | 1.3 
| gnd_cd   | gnd_cd       | char | @0    |     | 0.6 
| gnd_cd   | gnd_cd       | char | 2     |     | 5.4 
| gnd_cd   | gnd_cd       | char | N     |     | 0.1 
| gnd_cd   | gnd_cd       | char | 1     |     | 1.3 
| gnd_cd   | gnd_cd       | char | 99    |     | 0.6 
| mar_sign | mar_sign     | char | 0     |     | 5.4 
| mar_sign | mar_sign     | char | Y     |     | 0.1 
| mar_sign | mar_sign     | char | N     |     |   6 
| C        | C            | char | 6     |     |   2 
| C        | C            | char | 7     |     | 2.1 
| C        | C            | char | 8     |     | 2.2  
;
run;

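data want;
set have ;
/* NOTSORTED: rows with the same value only need to be adjacent, not sorted. */
by variable notsorted;
/* The sum statement below already retains index; RETAIN makes that explicit. */
retain index;
/* Increment on the first row of each BY group. */
if first.variable then index+1;
run;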

Note: I create index and increment it only when a new group value begins.

You could create a new table along these lines:
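For readers who do not use SAS, the same run-based numbering can be sketched in Python. This is only an illustration of the logic and not part of the original answer; the function name `add_run_index` and the dict-based rows are assumptions made for the example. `itertools.groupby` merges only consecutive equal keys, which mirrors `by variable notsorted`.

```python
from itertools import groupby


def add_run_index(rows, key="variable"):
    """Return copies of rows with an 'index' that goes up by 1
    each time the value of `key` differs from the previous row."""
    out = []
    # groupby only merges *consecutive* equal keys, like BY ... NOTSORTED
    runs = groupby(rows, key=lambda r: r[key])
    for index, (_, group) in enumerate(runs, start=1):
        for row in group:
            out.append({**row, "index": index})
    return out


# Hypothetical sample: just the "variable" column, in tableA's order
rows = [{"variable": v} for v in
        ["A", "A", "B", "B", "gnd_cd", "mar_sign", "C", "C"]]
indexed = add_run_index(rows)
print([r["index"] for r in indexed])  # [1, 1, 2, 2, 3, 4, 5, 5]
```

As with the data step, a value that reappears later in a separate run would get a new index, so this relies on equal values being adjacent.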

    select variable, monotonic() as newindex from (select distinct variable from mydata);

Then join it back to the original table. In fact, you can do it all in one step:

    select a.variable, a.new_variable, a.type, a.start, a.end, a.woe, b.newindex as index
    from mydata as a
    left join (select variable, monotonic() as newindex
               from (select distinct variable from mydata)) as b
    on a.variable = b.variable;

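Or something along those lines. I can't say I understand 100% what you are trying to achieve, but maybe this helps.

Note that the monotonic() function in SAS is still (I think) undocumented. That means SAS may or may not keep including it. It works, but perhaps they consider it experimental.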
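The lookup-then-join idea can also be sketched in Python, purely as an illustration. The function name `add_lookup_index` and the dict-based rows are assumptions for this example, not anything from SAS. Unlike the SQL version, this sketch numbers values in order of first appearance rather than in whatever order DISTINCT returns them.

```python
def add_lookup_index(rows, key="variable"):
    """Number each distinct value of `key` in order of first appearance,
    then attach that number to every row (the 'join back' step)."""
    lookup = {}
    for row in rows:
        # Only a value not seen before is assigned the next number
        lookup.setdefault(row[key], len(lookup) + 1)
    return [{**row, "index": lookup[row[key]]} for row in rows]


# Hypothetical sample in which "A" reappears after "B"
sample = [{"variable": v} for v in ["A", "B", "A", "C"]]
looked_up = add_lookup_index(sample)
print([r["index"] for r in looked_up])  # [1, 2, 1, 3]
```

Because the index comes from a per-value lookup, equal values get the same index even when their rows are not adjacent.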

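Consider using Postgres's dense_rank window function for gap-free ranking. Note that this numbers the groups in the sorted order of variable, so with typical collations C would get 3 rather than 5 as in the sample tables: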

SELECT *, DENSE_RANK() OVER (ORDER BY variable) as "index"
FROM mytable;

Rextester demo (with random seed data)
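If the index should follow the order in which groups first appear rather than the sorted order of variable, the table needs some column that records row order, since SQL tables have no inherent order. The sketch below assumes a hypothetical `id` column for that purpose and ranks groups by their smallest `id`. It runs against an in-memory SQLite database from Python as a stand-in for Postgres (assuming SQLite 3.25 or later for window functions); the window-function syntax in the query is standard SQL, but it has not been run on Postgres here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `id` is a hypothetical column recording the original row order
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, variable TEXT, woe REAL)"
)
conn.executemany(
    "INSERT INTO mytable (id, variable, woe) VALUES (?, ?, ?)",
    [(1, "A", 1.3), (2, "A", 0.6), (3, "gnd_cd", 1.3),
     (4, "mar_sign", 5.4), (5, "C", 2.0), (6, "C", 2.1)],
)

query = """
SELECT t.id, t.variable, t.woe,
       DENSE_RANK() OVER (ORDER BY t.first_id) AS "index"
FROM (
    -- first_id: smallest id within each group of equal variable values
    SELECT *, MIN(id) OVER (PARTITION BY variable) AS first_id
    FROM mytable
) AS t
ORDER BY t.id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Here C is ranked after mar_sign because its first row comes later, whereas ordering by variable alone would place it right after A.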
