Using pattern matching to outer join tables in Oracle SQL
I'm creating a data model in Oracle Fusion Financials to match parties together across their supplier use and customer use. These parties have a code that is registered in their name. Searching for the table names on Google will find the schemas (e.g. HZ_PARTIES), although you don't need to see the schemas to tackle this issue.
Our data quality is not quite what we want it to be. To ensure I'm not missing records, I need to join on other parties that also have the code in their name.
This is what I have so far, which gives results.
SELECT
RCTA.TRX_NUMBER
,RCTA.CT_REFERENCE
,HP.PARTY_NAME PARTY_NAME1
,HP2.PARTY_NAME PARTY_NAME2
,IEBC.IBAN CUSTOMER_IBAN
FROM
HZ_PARTIES HP,
HZ_PARTIES HP2,
IBY_ACCOUNT_OWNERS IAO,
IBY_EXT_BANK_ACCOUNTS IEBC,
RA_CUSTOMER_TRX_ALL RCTA,
HZ_CUST_ACCOUNTS HCA
WHERE 1=1
AND RCTA.BILL_TO_CUSTOMER_ID = HCA.CUST_ACCOUNT_ID (+)
AND HCA.PARTY_ID = HP.PARTY_ID(+)
AND REGEXP_SUBSTR(HP.PARTY_NAME,'([0-9]{2}[A-Z]{2}[0-9]{3})') in REGEXP_SUBSTR(HP2.PARTY_NAME,'([0-9]{2}[A-Z]{2}[0-9]{3})') -- Join on code found in party name.
AND IAO.ACCOUNT_OWNER_PARTY_ID (+) IN (HP2.PARTY_ID)
AND IAO.EXT_BANK_ACCOUNT_ID = IEBC.EXT_BANK_ACCOUNT_ID (+)
However, this performs an inner join instead of the outer join I need.
I've tried the following, which gives a syntax error (missing parenthesis):
AND REGEXP_SUBSTR(HP.PARTY_NAME,'([0-9]{2}[A-Z]{2}[0-9]{3})') = REGEXP_SUBSTR(HP2.PARTY_NAME,'([0-9]{2}[A-Z]{2}[0-9]{3})') (+)
I also tried this, which makes the query run for far too long. I did not wait for the results, because it's probably incorrect:
AND ( REGEXP_SUBSTR(HP.PARTY_NAME,'([0-9]{2}[A-Z]{2}[0-9]{3})') = REGEXP_SUBSTR(HP2.PARTY_NAME,'([0-9]{2}[A-Z]{2}[0-9]{3})') (+) -- Join on investor code found in party name.
OR NOT REGEXP_LIKE(HP.PARTY_NAME,'([0-9]{2}[A-Z]{2}[0-9]{3})') -- Escape to outer join in case there's no investor code in name
)
If it's necessary to make this work, I'm willing to rewrite the (+) joins to regular outer join syntax.
You put the outer join operator (+) in the wrong place. It should be something like this:
with
  hp (party_name) as
    (select '11AA111' from dual union all
     select '22BB222' from dual
    ),
  hp2 (party_name) as
    (select '11AA111' from dual union all
     select '33CC333' from dual
    )
select hp.*
from hp, hp2
where regexp_substr(hp.party_name, '([0-9]{2}[A-Z]{2}[0-9]{3})') =
      regexp_substr(hp2.party_name (+), '([0-9]{2}[A-Z]{2}[0-9]{3})');
-- the (+) goes here: on the hp2 column, inside the function call

PARTY_N
-------
11AA111
22BB222
As for proper joins ... well, yes, you could rewrite it if you want, but I don't think it'll help in this case. If the query runs OK as is, I'd leave it as is and rewrite it only if necessary.
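For comparison, here is a minimal sketch of the same demo written with ANSI join syntax. It reuses the hp and hp2 sample rows from above; the column aliases party_name1 and party_name2 are only illustrative.

```sql
-- Same sample data as the (+) demo, but with an explicit LEFT JOIN.
with
  hp (party_name) as
    (select '11AA111' from dual union all
     select '22BB222' from dual
    ),
  hp2 (party_name) as
    (select '11AA111' from dual union all
     select '33CC333' from dual
    )
select hp.party_name  as party_name1,
       hp2.party_name as party_name2
from hp
left join hp2
       on regexp_substr(hp2.party_name, '([0-9]{2}[A-Z]{2}[0-9]{3})')
        = regexp_substr(hp.party_name,  '([0-9]{2}[A-Z]{2}[0-9]{3})')
order by hp.party_name;
```

Both hp rows come back: 11AA111 is paired with its match from hp2, while 22BB222 is kept with a null party_name2 because nothing in hp2 carries the same code.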
I suggest you add a virtual column to the hz_parties table and index it, if you are allowed to:
alter table hz_parties add (code varchar2(7) generated always as (regexp_substr(party_name, '([0-9]{2}[A-Z]{2}[0-9]{3})')) virtual);
create index idx_parties_code on hz_parties (code);
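With the virtual column in place, the join no longer needs to repeat the regular expression. A minimal sketch, assuming the column is named code as above (the output aliases are made up for illustration):

```sql
-- Sketch only: assumes hz_parties.code exists as the virtual column defined above.
select hp.party_id    as party_id1,
       hp.party_name  as party_name1,
       hp2.party_id   as party_id2,
       hp2.party_name as party_name2
from hz_parties hp
left join hz_parties hp2
       on hp2.code = hp.code;  -- plain column comparison, eligible for idx_parties_code
```

Parties without a code have a null code, so the equality never matches for them and the left join keeps them with null hp2 columns.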
If you are not allowed to alter the table, then use a function-based index instead:
create index idx_parties_code on hz_parties (regexp_substr(party_name, '([0-9]{2}[A-Z]{2}[0-9]{3})'));
If you are not allowed to add an index on an existing table, then create a new table with an index, e.g.:
create table party_code
(
party_id number(10) not null,
code varchar2(7) not null,
primary key (party_id)
);
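insert into party_code (party_id, code)
select party_id, regexp_substr(party_name, '([0-9]{2}[A-Z]{2}[0-9]{3})')
from hz_parties
-- skip parties without a code; code is declared not null
where regexp_like(party_name, '([0-9]{2}[A-Z]{2}[0-9]{3})');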
create index idx_party_code on party_code (code, party_id);
In any of these cases you have the code pre-extracted and the join should be fast.
To find duplicates, just group by code. E.g.:
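As an illustration of the lookup-table variant, the join from a transaction's party to the other parties sharing its code could go through party_code twice. This is a sketch under the assumption that party_code has been populated as shown; the aliases pc and pc2 are hypothetical.

```sql
-- Sketch only: assumes the party_code table above exists and is populated.
select rcta.trx_number,
       hp.party_name  as party_name1,
       hp2.party_name as party_name2
from ra_customer_trx_all rcta
left join hz_cust_accounts hca on hca.cust_account_id = rcta.bill_to_customer_id
left join hz_parties hp        on hp.party_id = hca.party_id
left join party_code pc        on pc.party_id = hp.party_id   -- code of the bill-to party
left join party_code pc2       on pc2.code = pc.code          -- every party with that code
left join hz_parties hp2       on hp2.party_id = pc2.party_id;
```

Because every step is a left join, transactions whose party has no code still appear, with null in party_name2.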
select code, listagg(party_id, ', ') within group (order by party_id)
from party_code
group by code
having count(*) > 1;
Rewrite your query to use explicit joins anyway, to make it readable, fix the erroneous outer joins and spot other possible mistakes.
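One possible shape for that rewrite is sketched below. It assumes the intent is to keep every transaction and to treat each later table as optional; the table and column names are taken from the question, and the join order is my reading of it rather than a verified model of the Fusion schema.

```sql
-- Sketch only: the question's query with explicit LEFT JOINs throughout.
select rcta.trx_number,
       rcta.ct_reference,
       hp.party_name  as party_name1,
       hp2.party_name as party_name2,
       iebc.iban      as customer_iban
from ra_customer_trx_all rcta
left join hz_cust_accounts hca
       on hca.cust_account_id = rcta.bill_to_customer_id
left join hz_parties hp
       on hp.party_id = hca.party_id
left join hz_parties hp2  -- join on the code found in the party name
       on regexp_substr(hp2.party_name, '([0-9]{2}[A-Z]{2}[0-9]{3})')
        = regexp_substr(hp.party_name,  '([0-9]{2}[A-Z]{2}[0-9]{3})')
left join iby_account_owners iao
       on iao.account_owner_party_id = hp2.party_id
left join iby_ext_bank_accounts iebc
       on iebc.ext_bank_account_id = iao.ext_bank_account_id;
```

Note that hp2 also matches the hp row itself whenever the name contains a code, so the party's own bank accounts are included alongside those of the other parties with the same code.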