
SQL Join with mapping table with NOT condition

There is a dimension table like this:

| id | column_a | column_b |
|----|----------|----------|
| 1  | val_a1   | val_b1   |
| 2  | val_a1   | val_b2   |
| 3  | val_a2   | val_b2   |
| 4  | val_a2   | val_b3   |
| 5  | val_a2   | val_b1   |

I am creating a mapping table that adds a new column, column_x, to the dimension:

| column_a | column_b     | column_x |
|----------|--------------|----------|
| val_a1   | 'Any-Value'  | val_x1   |
| val_a2   | val_b2       | val_x2   |
| val_a2   | 'not val_b2' | val_x3   |

How to read the mapping table:

The first row means that for all rows with val_a1 in column_a and any value in column_b, column_x will be val_x1.

The second row means that for rows with val_a2 in column_a and specifically val_b2 in column_b, column_x will be val_x2.

The third row means that for rows with val_a2 in column_a and anything other than val_b2 in column_b, column_x will be val_x3.

The output will look like this:

| id | column_a | column_b | column_x |
|----|----------|----------|----------|
| 1  | val_a1   | val_b1   | val_x1   |
| 2  | val_a1   | val_b2   | val_x1   |
| 3  | val_a2   | val_b2   | val_x2   |
| 4  | val_a2   | val_b3   | val_x3   |
| 5  | val_a2   | val_b1   | val_x3   |

Can I do it in a single join? I can obviously break it into multiple CTEs and do it that way. What would be the most efficient way to do it if the dimension table is big?

I thought of changing the mapping table into this, and adding priorities with the help of IF conditions in the ON clause, combined with OR conditions:

| column_a | column_b    | column_x |
|----------|-------------|----------|
| val_a1   | 'Any-Value' | val_x1   |
| val_a2   | val_b2      | val_x2   |
| val_a2   | 'Any-Value' | val_x3   |

But it's giving me duplicates.

I can use SQL or Python to do this.

This might do. Please note that more than one mapping record might be joined to certain dimension records. Having OR in join conditions is not very performant, by the way.

select t1.id, t1.column_a, t1.column_b, t2.column_x
from _dimension t1 left outer join _mapping t2
on
  -- wildcard rule: any column_b matches
  t1.column_a = t2.column_a and t2.column_b = 'Any-Value' OR
  -- exact rule: column_b must be equal
  t1.column_a = t2.column_a and t1.column_b = t2.column_b OR
  -- negated rule: 'not <value>' matches every column_b except <value>
  t1.column_a = t2.column_a and t2.column_b ~* '^not '
   and t1.column_b <> regexp_replace(t2.column_b, '^not (.+)$', '\1', 'i');

About priorities, I would suggest that you add an explicit priority_level integer column to your mapping table. Then the query will look like this:

select distinct on (t1.id)
    t1.id, t1.column_a, t1.column_b, t2.column_x
from _dimension t1 left outer join _mapping t2
on
  t1.column_a = t2.column_a and t2.column_b = 'Any-Value' OR
  t1.column_a = t2.column_a and t1.column_b = t2.column_b OR
  t1.column_a = t2.column_a and t2.column_b ~* '^not '
   and t1.column_b <> regexp_replace(t2.column_b, '^not (.+)$', '\1', 'i')
-- distinct on keeps the first row per id, i.e. the lowest priority_level
order by t1.id, t2.priority_level;

For simplicity, it may be a good idea to use null instead of 'Any-Value' and '!val_b' instead of 'not val_b' in the mapping table.
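The null / '!' convention together with a priority_level column can be tried end to end. The following is a minimal sketch in Python using the standard-library sqlite3 module instead of PostgreSQL, so distinct on is swapped for a correlated subquery with order by ... limit 1. The table names follow the queries above; the priority_level values are assumptions chosen for illustration, not part of the original mapping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table _dimension (id integer primary key, column_a text, column_b text);
create table _mapping (column_a text, column_b text, column_x text,
                       priority_level integer);
insert into _dimension values
  (1, 'val_a1', 'val_b1'), (2, 'val_a1', 'val_b2'), (3, 'val_a2', 'val_b2'),
  (4, 'val_a2', 'val_b3'), (5, 'val_a2', 'val_b1');
-- null means "any value", a leading '!' means "anything except this value";
-- the priority_level values are illustrative assumptions
insert into _mapping values
  ('val_a1', null,      'val_x1', 1),
  ('val_a2', 'val_b2',  'val_x2', 1),
  ('val_a2', '!val_b2', 'val_x3', 2);
""")

# For each dimension row, pick the matching mapping row with the lowest
# priority_level; the subquery returns at most one value, so no duplicates.
QUERY = """
select t1.id, t1.column_a, t1.column_b,
       (select t2.column_x
          from _mapping t2
         where t2.column_a = t1.column_a
           and (t2.column_b is null
                or t2.column_b = t1.column_b
                or (t2.column_b like '!%'
                    and substr(t2.column_b, 2) <> t1.column_b))
         order by t2.priority_level
         limit 1) as column_x
from _dimension t1
order by t1.id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Because the subquery yields a single value per dimension row, the result has exactly one row per id even when several mapping rules match.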

Can you use a CASE WHEN expression instead of a join? Your query / view would then be:

select id,
       column_a,
       column_b,
       case
           when column_a = 'val_a1' then 'val_x1'
           when column_a = 'val_a2' and column_b = 'val_b2' then 'val_x2'
           when column_a = 'val_a2' and column_b <> 'val_b2' then 'val_x3'
           -- add your other conditions here
       end as column_x
from dimension_table  -- placeholder name for the dimension table
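Since Python is also an option for the asker, the same hard-coded rules can be written as a plain function. This is only a sketch that mirrors the CASE expression; the function name column_x_for and the in-memory dimension list are hypothetical names introduced for illustration.

```python
def column_x_for(column_a, column_b):
    """Mirror of the CASE expression: the first matching branch wins."""
    if column_a == "val_a1":
        return "val_x1"
    if column_a == "val_a2" and column_b == "val_b2":
        return "val_x2"
    # SQL's <> is never true for NULL, so None is excluded explicitly here
    if column_a == "val_a2" and column_b is not None and column_b != "val_b2":
        return "val_x3"
    return None  # no rule matched, like a CASE without ELSE


# Sample rows (id, column_a, column_b) taken from the question
dimension = [
    (1, "val_a1", "val_b1"),
    (2, "val_a1", "val_b2"),
    (3, "val_a2", "val_b2"),
    (4, "val_a2", "val_b3"),
    (5, "val_a2", "val_b1"),
]

result = [(i, a, b, column_x_for(a, b)) for i, a, b in dimension]
for row in result:
    print(row)
```

As with the CASE expression, the rules live in code rather than in a table, so every new mapping means a code change.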

In case you need to join, you can do so with a single join on SQL Server using CASE WHEN and an additional column in your mapping table for the values of column_b that you want to exclude.

Let's say this is your new mapping table:

| column_a | column_b | column_b_exclude | column_x |
|----------|----------|------------------|----------|
| val_a1   | NULL     | NULL             | val_x1   |
| val_a2   | val_b2   | NULL             | val_x2   |
| val_a2   | NULL     | val_b2           | val_x3   |

Your join would look like this:

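select *
from dimension_table t  -- placeholder name for the dimension table
inner join mapping m
  on t.column_a = m.column_a
 -- a null m.column_b matches any t.column_b
 and case when m.column_b is null then t.column_b else m.column_b end = t.column_b
 -- 'x' is a sentinel and must never occur as a real column_b value
 and case when m.column_b_exclude is not null then m.column_b_exclude else 'x' end <> t.column_b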
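To check that this join yields one row per dimension record, here is a small sketch that runs the same ON conditions in SQLite through Python's standard-library sqlite3 module rather than on SQL Server. The table names dimension_table and mapping are placeholders, and the data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table dimension_table (id integer primary key, column_a text, column_b text);
create table mapping (column_a text, column_b text,
                      column_b_exclude text, column_x text);
insert into dimension_table values
  (1, 'val_a1', 'val_b1'), (2, 'val_a1', 'val_b2'), (3, 'val_a2', 'val_b2'),
  (4, 'val_a2', 'val_b3'), (5, 'val_a2', 'val_b1');
insert into mapping values
  ('val_a1', null,     null,     'val_x1'),
  ('val_a2', 'val_b2', null,     'val_x2'),
  ('val_a2', null,     'val_b2', 'val_x3');
""")

# Same conditions as the join above; 'x' is the sentinel from the answer
# and is assumed never to appear as a real column_b value.
QUERY = """
select t.id, t.column_a, t.column_b, m.column_x
from dimension_table t
inner join mapping m
  on t.column_a = m.column_a
 and case when m.column_b is null then t.column_b else m.column_b end = t.column_b
 and case when m.column_b_exclude is not null
          then m.column_b_exclude else 'x' end <> t.column_b
order by t.id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Each dimension row matches exactly one mapping row here because the include and exclude rules for val_a2 are mutually exclusive; overlapping rules would still need a priority column to avoid duplicates.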
