How to replace all subsets of characters based on values of other tables in pl/pgsql?
I've been researching how to replace substrings in the text of a single row based on the column values of rows in another table, but I could not get it to work: the update only applies the values from the first row of the other table. So I'm planning to put this in a loop in a plpgsql function.
Here is a snippet of my tables. Main table:
Table "public.tbl_main"
Column | Type | Modifiers
-----------------------+--------+-----------
maptarget | text |
expression | text |
maptarget | expression
-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
43194-0 | 363787002:70434600=(386053000:704347000=(414237002:704320005=259470008,704318007=118539007,704319004=50863008),704327008=122592007,246501002=703690001,370132008=30766002)
Look-up table:
Table "public.tbl_values"
Column | Type | Modifiers
-----------------------+--------+-----------
conceptid | bigint |
term | text |
conceptid | term
-----------+------------------------------------------
386053000 | Patient evaluation procedure (procedure)
363787002 | Observable entity (observable entity)
704347000 | Observes (attribute)
704320005 | Towards (attribute)
704318007 | Property type (attribute)
I want to create a function that replaces all numeric values in the tbl_main.expression column with their corresponding tbl_values.term, using tbl_values.conceptid as the link to each numeric value in the expression string.
I'm currently stuck on the looping part since I'm new to LOOP in plpgsql. Here is the rough draft of my function:
--create first a test table
drop table if exists tbl_test;
create table tbl_test as select * from tbl_main limit 1;
--
create or replace function test ()
RETURNS SETOF tbl_main
LANGUAGE plpgsql
AS $function$
declare
    resultItem tbl_main;
    v_mapTarget text;
    v_expression text;
    ctr int;
begin
    v_mapTarget:='';
    v_expression:='';
    ctr:=1;
    for resultItem in (select * from tbl_test) loop
        v_mapTarget:=resultItem.mapTarget;
        select into v_expression expression from ee;
        raise notice 'parameter used: %',v_mapTarget;
        raise notice 'current expression: %',v_expression;
        update ee set expression=replace(v_expression, new_exp::text, term) from (select new_exp::text, term from tbl_values offset ctr limit 1) b ;
        ctr:=ctr+1;
        raise notice 'counter: %', ctr;
        v_expression:= (select expression from ee);
        resultItem.expression:= v_expression;
        raise notice 'current expression: %',v_expression;
        return next resultItem;
    end loop;
    return;
end;
$function$;
Any further information will be much appreciated.
My Postgres version:
PostgreSQL 9.3.6 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit
Looping is always a measure of last resort. Even in this case it is substantially cheaper to concatenate a query string using a query, and execute it once:
CREATE OR REPLACE FUNCTION f_make_expression(_expr text, OUT result text) AS
$func$
BEGIN
   EXECUTE (
      SELECT 'SELECT ' || string_agg('replace(', '') || '$1,'
          || string_agg(format('%L,%L)', conceptid::text, v.term), ','
                        ORDER BY conceptid DESC)
      FROM  (
         SELECT conceptid::bigint
         FROM   regexp_split_to_table($1, '\D+') conceptid
         WHERE  conceptid <> ''
         ) m
      JOIN   tbl_values v USING (conceptid)
      )
   USING _expr
   INTO  result;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT *, f_make_expression(expression) FROM tbl_main;
However, if not all conceptid have the same number of digits, the operation could be ambiguous. Replace conceptid with more digits first to avoid that (ORDER BY conceptid DESC does that) and make sure that replacement strings do not introduce ambiguity (numbers that might be replaced in the next step). A related answer covers more of these pitfalls.
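To illustrate why the order of replacements matters, here is a minimal sketch with made-up values (not taken from the question's tables), where the shorter number '123' is a prefix of the longer number '12345':

```sql
-- Illustration only: the values 'long' and 'short' are hypothetical terms.
SELECT replace(replace('12345=123', '12345', 'long'), '123', 'short') AS longer_first   -- long=short
     , replace(replace('12345=123', '123', 'short'), '12345', 'long') AS shorter_first; -- short45=short
```

Replacing the shorter number first corrupts the longer one, so the second replacement no longer finds a match.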
The token $1 is used in two different ways here, don't be misled:
   regexp_split_to_table($1, '\D+')
This one references the first function parameter _expr. You could as well use the parameter name.
   || '$1,'
This concatenates into the SQL string a reference to the first expression passed via the USING clause to EXECUTE. Parameters of the outer function are not visible inside EXECUTE, so you have to pass them explicitly.
It's pure coincidence that $1 (_expr) of the outer function is passed as $1 to EXECUTE. You might as well hand over $7 as the third expression in the USING clause ($3) ...
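A small sketch may make the distinction concrete. The function name f_using_demo and its parameters are hypothetical and only serve as illustration: the second parameter of the outer function is passed as the first USING expression, so the dynamic statement sees it as $1.

```sql
-- Hypothetical demo function, not part of the original answer.
-- Outer parameter $2 (_b) is the first USING expression,
-- so inside the dynamic SQL string it is referenced as $1.
CREATE OR REPLACE FUNCTION f_using_demo(_a text, _b text, OUT result text) AS
$func$
BEGIN
   EXECUTE 'SELECT upper($1)'
   INTO  result
   USING _b;
END
$func$ LANGUAGE plpgsql;

SELECT f_using_demo('ignored', 'kept');  -- KEPT
```

The positional numbers inside the string passed to EXECUTE only count the expressions in the USING clause, independent of the outer function's parameter list.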
I added a debug function to the fiddle. With a minor modification you can output the generated SQL string to inspect it.
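The fiddle's debug function is not reproduced here, so the following is only a sketch of what such a variant might look like. The name f_make_expression_debug is an assumption, and the tbl_values table from the question is assumed to exist. It runs the same string-building query as f_make_expression but returns the generated statement instead of executing it:

```sql
-- Hypothetical debug variant (name is an assumption, not the fiddle's code).
-- Returns the generated SQL string instead of executing it.
-- Assumes tbl_values(conceptid bigint, term text) from the question exists.
CREATE OR REPLACE FUNCTION f_make_expression_debug(_expr text)
  RETURNS text AS
$func$
SELECT 'SELECT ' || string_agg('replace(', '') || '$1,'
    || string_agg(format('%L,%L)', conceptid::text, v.term), ','
                  ORDER BY conceptid DESC)
FROM  (
   SELECT conceptid::bigint
   FROM   regexp_split_to_table($1, '\D+') conceptid
   WHERE  conceptid <> ''
   ) m
JOIN   tbl_values v USING (conceptid)
$func$ LANGUAGE sql STABLE;

-- Inspect the statement generated for each row:
SELECT maptarget, f_make_expression_debug(expression) FROM tbl_main;
```

The result is NULL for an expression that contains no number found in tbl_values, since the aggregates then see no rows.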
Here is a pure SQL alternative. Probably also faster:
CREATE OR REPLACE FUNCTION f_make_expression_sql(_expr text)
  RETURNS text AS
$func$
SELECT string_agg(CASE WHEN $1 ~ '^\d'
                       THEN txt || COALESCE(v.term, t.conceptid)
                       ELSE COALESCE(v.term, t.conceptid) || txt END
                , '' ORDER BY rn) AS result
FROM  (
   SELECT *, row_number() OVER () AS rn
   FROM  (
      SELECT regexp_split_to_table($1, '\D+') conceptid
           , regexp_split_to_table($1, '\d+') txt
      ) sub
   ) t
-- cast to bigint to match the type of tbl_values.conceptid
LEFT   JOIN tbl_values v ON v.conceptid = NULLIF(t.conceptid, '')::bigint
$func$ LANGUAGE sql STABLE;
In Postgres 9.4 this can be much more elegant with two new features:
ROWS FROM to replace the old (weird) technique to sync set-returning functions
WITH ORDINALITY to get row numbers on the fly reliably:
CREATE OR REPLACE FUNCTION f_make_expression_sql(_expr text)
  RETURNS text AS
$func$
SELECT string_agg(CASE WHEN $1 ~ '^\d'
                       THEN txt || COALESCE(v.term, t.conceptid)
                       ELSE COALESCE(v.term, t.conceptid) || txt END
                , '' ORDER BY rn) AS result
FROM   ROWS FROM (
          regexp_split_to_table($1, '\D+')
        , regexp_split_to_table($1, '\d+')
       ) WITH ORDINALITY AS t(conceptid, txt, rn)
-- cast to bigint to match the type of tbl_values.conceptid
LEFT   JOIN tbl_values v ON v.conceptid = NULLIF(t.conceptid, '')::bigint
$func$ LANGUAGE sql STABLE;
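To see in isolation what ROWS FROM combined with WITH ORDINALITY produces, here is a minimal sketch with made-up arrays (requires Postgres 9.4 or later). The column names letter, num and rn are arbitrary:

```sql
-- Illustration only: two set-returning functions with different row counts.
-- ROWS FROM steps through both in parallel, pads the shorter one with NULL,
-- and WITH ORDINALITY appends a running row number.
SELECT *
FROM   ROWS FROM (
          unnest(ARRAY['a','b'])
        , unnest(ARRAY[1,2,3])
       ) WITH ORDINALITY AS t(letter, num, rn);
--  letter | num | rn
-- --------+-----+----
--  a      |   1 |  1
--  b      |   2 |  2
--         |   3 |  3
```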
SQL Fiddle demonstrating all for Postgres 9.3.
There's also another way that does not require creating functions: using WITH RECURSIVE. I have used it with a lookup table of thousands of rows.
You'll need to change the following table and column names to your own:
tbl_main, strsourcetext, strreplacedtext;
lookuptable, strreplacefrom, strreplaceto.
WITH RECURSIVE replaced AS (
   (SELECT
       strsourcetext,
       strreplacedtext,
       array_agg(strreplacefrom ORDER BY length(strreplacefrom) DESC, strreplacefrom, strreplaceto) AS arrreplacefrom,
       array_agg(strreplaceto ORDER BY length(strreplacefrom) DESC, strreplacefrom, strreplaceto) AS arrreplaceto,
       count(1) AS intcount,
       1 AS intindex
    FROM tbl_main, lookuptable WHERE tbl_main.strsourcetext LIKE '%' || strreplacefrom || '%'
    GROUP BY strsourcetext)
   UNION ALL
   SELECT
       strsourcetext,
       replace(strreplacedtext, arrreplacefrom[intindex], arrreplaceto[intindex]) AS strreplacedtext,
       arrreplacefrom,
       arrreplaceto,
       intcount,
       intindex+1 AS intindex
   FROM replaced WHERE intindex<=intcount
)
SELECT strsourcetext,
       (array_agg(strreplacedtext ORDER BY intindex DESC))[1] AS strreplacedtext
FROM replaced
GROUP BY strsourcetext