
Teradata string/text search

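I have two tables in Teradata: Table A and Table B.

Table A has two columns, as follows:

Table A

Column 1 --> database name

Column 2 --> table name

Table B has multiple columns, including a text column.

I want to search the text column of Table B for each databasename.tablename from Table A. I can't use the LIKE operator because there are about 2000 different table names in Table A. I tried doing this with a join on POSITION, as shown below, but the query ran for a very long time with a high PJI, and I had to abort it manually.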

select distinct a.Tablename ,b.text
from TableA a
inner join TableB b
on position(Trim(b.Text) in Trim('a.Databasename.'||a.tablename))>0
where b.theDate between add_months(date,-6) and date

UNION ALL

select distinct a.Tablename ,b.text
from TableA a
inner join TableB b
on position (Trim('a.Databasename.'||a.tablename) in Trim(b.Text))  >0
where b.theDate between add_months(date,-6) and date;

Is there another way to do this string search? Please share the SQL.

Thanks

REGEXP_SIMILAR:

One option is to use REGEXP_SIMILAR(), which is more precise than LIKE. I'm not sure whether it will be faster, but it's worth a try:

CREATE MULTISET VOLATILE TABLE TABLEA 
(databasename varchar(30), tablename varchar(30)) 
PRIMARY INDEX (databasename, tablename) ON COMMIT PRESERVE ROWS;

INSERT INTO TABLEA VALUES ('dba','tbla');
INSERT INTO TABLEA VALUES ('dba','tblb');
INSERT INTO TABLEA VALUES ('dbb','tbla');

CREATE MULTISET VOLATILE TABLE TABLEB 
(id int, sqlqry VARCHAR(5000)) 
ON COMMIT PRESERVE ROWS;

INSERT INTO TABLEB VALUES (1, 'SELECT * FROM dba.tbla;');
INSERT INTO TABLEB VALUES (2, 'SELECT smoecolumn FROM dba.tblb INNER JOIN dba.tbla ON foo = bar WHERe 1=1;');
INSERT INTO TABLEB VALUES (3, 'SELECT * FROM dbb.tbla WHERE foo=bar');

SELECT *
FROM TABLEA
    INNER JOIN TABLEB
        ON REGEXP_SIMILAR(TABLEB.sqlqry, '^.*' || TABLEA.databasename || '\.' || TABLEA.tablename || '.*$', 'i') = 1;

+--------------+-----------+----+-----------------------------------------------------------------------------+
| databasename | tablename | id | sqlqry                                                                      |
+--------------+-----------+----+-----------------------------------------------------------------------------+
| dbb          | tbla      | 3  | SELECT * FROM dbb.tbla WHERE foo=bar                                        |
| dba          | tbla      | 2  | SELECT smoecolumn FROM dba.tblb INNER JOIN dba.tbla ON foo = bar WHERe 1=1; |
| dba          | tbla      | 1  | SELECT * FROM dba.tbla;                                                     |
| dba          | tblb      | 2  | SELECT smoecolumn FROM dba.tblb INNER JOIN dba.tbla ON foo = bar WHERe 1=1; |
+--------------+-----------+----+-----------------------------------------------------------------------------+
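
One caveat with the REGEXP_SIMILAR pattern: it matches the name anywhere in the text, so dba.tbla would also match a reference to a longer name such as dba.tbla_backup (a hypothetical table used only for illustration). Multi-line SQL text can be a second problem, because a period in a regular expression normally does not match a newline. The variation below is a sketch, not a tested solution. It assumes the object names contain only letters, digits, and underscores (so no regex metacharacters need escaping) and that the 'n' match argument makes the period match newlines on your release.

```sql
-- Sketch only: uses the same volatile tables TABLEA and TABLEB as above.
-- Assumption: object names consist of letters, digits and underscores.
SELECT TABLEA.databasename, TABLEA.tablename, TABLEB.id
FROM TABLEA
    INNER JOIN TABLEB
        ON REGEXP_SIMILAR(
               TABLEB.sqlqry,
               -- The name must be preceded and followed by a non-identifier
               -- character, or sit at the start/end of the text.
               '^(.*[^A-Za-z0-9_])?' || TABLEA.databasename || '\.' || TABLEA.tablename || '([^A-Za-z0-9_].*)?$',
               'in'   -- i = case-insensitive, n = period also matches newline
           ) = 1;
```

On the three sample rows this should find the same database/table/id combinations as the original pattern. It is still a join without an equality condition, so every TABLEA row is compared against every TABLEB row, and it is unlikely to fix the run time on its own.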

STRTOK_SPLIT_TO_TABLE:

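This is what I was aiming at with my strtok_split_to_table comment. Basically, you split the SQL text in TABLEB into words (splitting on the space and ; characters). This produces one row per word.

From that list, you keep only the words that contain a period (for example, databasename.tablename).

Then you can join TABLEB to TABLEA: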

CREATE MULTISET VOLATILE TABLE TABLEA 
(databasename varchar(30), tablename varchar(30)) 
PRIMARY INDEX (databasename, tablename) ON COMMIT PRESERVE ROWS;

INSERT INTO TABLEA VALUES ('dba','tbla');
INSERT INTO TABLEA VALUES ('dba','tblb');
INSERT INTO TABLEA VALUES ('dbb','tbla');

CREATE MULTISET VOLATILE TABLE TABLEB 
(id int, sqlqry VARCHAR(5000)) 
ON COMMIT PRESERVE ROWS;

INSERT INTO TABLEB VALUES (1, 'SELECT * FROM dba.tbla;');
INSERT INTO TABLEB VALUES (2, 'SELECT smoecolumn FROM dba.tblb INNER JOIN dba.tbla ON foo = bar WHERe 1=1;');
INSERT INTO TABLEB VALUES (3, 'SELECT * FROM dbb.tbla WHERE foo=bar');

WITH sqlwords AS
(
    SELECT tablebid, sqlwordnum, sqlword
    FROM TABLE (strtok_split_to_table(TABLEB.id, TABLEB.sqlqry, ' ;')
    RETURNS (tablebid integer, sqlwordnum integer, sqlword varchar(100)character set unicode) ) as sqlwordsplitter
    WHERE sqlwordsplitter.sqlword like '%.%'
)
SELECT TABLEA.*, TABLEB.*
FROM TABLEA
    INNER JOIN sqlwords
        ON TABLEA.databasename = strtok(sqlwords.sqlword, '.', 1)
            AND TABLEA.tablename = strtok(sqlwords.sqlword, '.', 2)
    INNER JOIN TABLEB
        ON sqlwords.tablebid = TABLEB.id;


+--------------+-----------+----+-----------------------------------------------------------------------------+
| databasename | tablename | id | sqlqry                                                                      |
+--------------+-----------+----+-----------------------------------------------------------------------------+
| dbb          | tbla      | 3  | SELECT * FROM dbb.tbla WHERE foo=bar                                        |
| dba          | tbla      | 2  | SELECT smoecolumn FROM dba.tblb INNER JOIN dba.tbla ON foo = bar WHERe 1=1; |
| dba          | tbla      | 1  | SELECT * FROM dba.tbla;                                                     |
| dba          | tblb      | 2  | SELECT smoecolumn FROM dba.tblb INNER JOIN dba.tbla ON foo = bar WHERe 1=1; |
+--------------+-----------+----+-----------------------------------------------------------------------------+

This won't be fast, since we have to tokenize the text, but it will certainly get the job done.
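
To see what the tokenizing step produces before the join, it can help to run the splitter on its own. The sketch below reuses the volatile TABLEB from above and lists only the period-containing words for the row with id = 2. For that row it should return the two words dba.tblb and dba.tbla.

```sql
-- Sketch: inspect the tokens produced for a single TABLEB row.
SELECT sqlwordsplitter.tablebid,
       sqlwordsplitter.sqlwordnum,
       sqlwordsplitter.sqlword
FROM TABLE (strtok_split_to_table(TABLEB.id, TABLEB.sqlqry, ' ;')
    RETURNS (tablebid integer, sqlwordnum integer, sqlword varchar(100) character set unicode)) AS sqlwordsplitter
WHERE sqlwordsplitter.sqlword LIKE '%.%'   -- keep only words containing a period
  AND sqlwordsplitter.tablebid = 2         -- one sample row
ORDER BY sqlwordsplitter.sqlwordnum;
```

Looking at the tokens this way also shows the limits of the delimiter list. With only space and ; as delimiters, a reference written as (dba.tbla) or followed directly by a comma or a line break would stay attached to its neighbors and would not match in the join, so real query text probably needs a longer delimiter string. Words such as decimal literals or alias.column references also pass the '%.%' filter, but they simply find no partner in TABLEA.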

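If the goal is to extract a single table name from a CREATE TABLE AS statement, you can apply a regular expression to get the table and database names: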

RegExp_Substr(SqlTextInfo, 'AS\s+?(.*?[.])?\K.+?\s+?(?=WITH\s)',1,1,'i') AS TableName
RegExp_Substr(SqlTextInfo, 'AS\s+?\K.*?(?=[.](.+?\s+)?WITH\s)',1,1,'i') AS DatabaseName

If the database name is missing, you can COALESCE it with QryLogV.DefaultDatabase.
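
Put together, the two expressions and the fallback might look like the sketch below. The view names and the join are assumptions that are not stated in the answer: it assumes SqlTextInfo is read from DBC.QryLogSQLV, that this view joins to DBC.QryLogV on ProcID and QueryID, and that SqlRowNo = 1 selects the first chunk of each statement's text. Adjust it to whatever DBQL views are available on your system.

```sql
-- Sketch only; the views, join columns and SqlRowNo filter are assumptions.
SELECT
    COALESCE(
        RegExp_Substr(s.SqlTextInfo, 'AS\s+?\K.*?(?=[.](.+?\s+)?WITH\s)', 1, 1, 'i'),
        l.DefaultDatabase   -- fall back when the statement has no database qualifier
    ) AS DatabaseName,
    RegExp_Substr(s.SqlTextInfo, 'AS\s+?(.*?[.])?\K.+?\s+?(?=WITH\s)', 1, 1, 'i') AS TableName
FROM DBC.QryLogV l
    INNER JOIN DBC.QryLogSQLV s
        ON  l.ProcID  = s.ProcID
        AND l.QueryID = s.QueryID
WHERE s.SqlRowNo = 1;
```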

