How to replace all subsets of characters based on values of other tables in PL/pgSQL?

I've been researching how to replace subsets of characters in the string of a single row based on column values from rows of another table, but could not get it to work: the update only applies the values from the first row of the other table. So I'm planning to do this in a loop inside a PL/pgSQL function.

Here is a snippet of my tables. Main table:

 Table "public.tbl_main"
        Column         |  Type  | Modifiers 
-----------------------+--------+-----------
 maptarget             | text   | 
 expression            | text   | 


 maptarget |                 expression
-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 43194-0   | 363787002:70434600=(386053000:704347000=(414237002:704320005=259470008,704318007=118539007,704319004=50863008),704327008=122592007,246501002=703690001,370132008=30766002)

Look-up table:

Table "public.tbl_values"
        Column         |  Type  | Modifiers 
-----------------------+--------+-----------
 conceptid             | bigint | 
 term                  | text   |

 conceptid |                   term                   
-----------+------------------------------------------
 386053000 | Patient evaluation procedure (procedure)
 363787002 | Observable entity (observable entity)
 704347000 | Observes (attribute)
 704320005 | Towards (attribute)
 704318007 | Property type (attribute)

I want to create a function that replaces all numeric values in the tbl_main.expression column with their corresponding tbl_values.term, using tbl_values.conceptid as the link to each numeric value in the expression string.

I'm currently stuck on the looping part since I'm new to LOOP in PL/pgSQL. Here is the rough draft of my function.

--create first a test table
drop table if exists tbl_test;
create table tbl_test as select * from tbl_main limit 1;
--

create or replace function test () 
 RETURNS SETOF tbl_main
 LANGUAGE plpgsql
AS $function$
declare
 resultItem tbl_main;
 v_mapTarget text;
 v_expression text;
 ctr int;
begin
  v_mapTarget:='';
  v_expression:='';
  ctr:=1;

  for resultItem in (select * from tbl_test) loop
     v_mapTarget:=resultItem.mapTarget;
     select into v_expression expression from ee;
     raise notice 'parameter used: %',v_mapTarget;
     raise notice 'current expression: %',v_expression;

     update ee set expression=replace(v_expression, new_exp::text, term) from (select new_exp::text, term from tbl_values offset ctr limit 1) b ; 
     ctr:=ctr+1;
     raise notice 'counter: %', ctr;
     v_expression:= (select expression from ee);
     resultItem.expression:= v_expression;
     raise notice 'current expression: %',v_expression;
return next resultItem;
 end loop;
 return;
 end;
$function$;

Any further information will be much appreciated.
My Postgres version:

PostgreSQL 9.3.6 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit

PL/pgSQL function with dynamic SQL

Looping is always a measure of last resort. Even in this case it is substantially cheaper to concatenate a query string using a query, and execute it once:

CREATE OR REPLACE FUNCTION f_make_expression(_expr text, OUT result text) AS
$func$
BEGIN
   EXECUTE (
      SELECT 'SELECT ' || string_agg('replace(', '') || '$1,'
           || string_agg(format('%L,%L)', conceptid::text, v.term), ','
                         ORDER BY conceptid DESC)
      FROM  (
         SELECT conceptid::bigint
         FROM   regexp_split_to_table($1, '\D+') conceptid
         WHERE  conceptid <> ''
         ) m
      JOIN   tbl_values v USING (conceptid)
      )
   USING _expr
   INTO result;
END
$func$ LANGUAGE plpgsql;

Call:

SELECT *, f_make_expression(expression) FROM tbl_main;

However, if not all conceptid values have the same number of digits, the operation could be ambiguous. Replace the conceptid with more digits first to avoid that (ORDER BY conceptid DESC does that), and make sure that replacement strings do not introduce ambiguity (numbers that might be replaced in the next step).
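To make the ordering issue concrete, here is a minimal sketch with made-up values (the strings '12', '123', 'A' and 'B' are illustrative assumptions, not data from the question). It applies the same two replacements in both orders:

```sql
-- Hypothetical values: '12' is a prefix of '123'.
SELECT replace(replace('12,123', '12', 'A'), '123', 'B') AS short_first  -- 'A,A3': '123' was already broken up
     , replace(replace('12,123', '123', 'B'), '12', 'A') AS long_first;  -- 'A,B':  longer number replaced first
```

Replacing the shorter number first corrupts the longer one, which is why the generated query nests the replacements so that larger numbers are handled first.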

The token $1 is used in two different ways here, don't be misled:

  1.  regexp_split_to_table($1, '\D+')

    This one references the first function parameter _expr. You could as well use the parameter name.

  2.  || '$1,'

    This concatenates into the SQL string a reference to the first expression passed via the USING clause to EXECUTE. Parameters of the outer function are not visible inside EXECUTE, you have to pass them explicitly.

It's pure coincidence that $1 (_expr) of the outer function is passed as $1 to EXECUTE. You might as well hand over $7 as the third expression in the USING clause ($3) ...
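A small sketch of that point, using a hypothetical function f_using_demo that is not part of the original answer. The second parameter of the outer function is passed as the first USING expression, so inside the executed string it is addressed as $1:

```sql
-- Hypothetical demo function, for illustration only.
CREATE OR REPLACE FUNCTION f_using_demo(_a text, _b text, OUT result text) AS
$func$
BEGIN
   -- Inside the query string, $1 means "first USING expression" (_b here),
   -- not the first parameter of the outer function (_a).
   EXECUTE 'SELECT upper($1)'
   INTO  result
   USING _b;
END
$func$ LANGUAGE plpgsql;

SELECT f_using_demo('first', 'second');  -- returns 'SECOND'
```

The numbering inside the string follows the order of expressions in USING only, independent of the outer function's parameter list.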

I added a debug function to the fiddle. With a minor modification you can output the generated SQL string to inspect it.
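The debug function from the fiddle is not reproduced here. As a rough sketch of what such a modification could look like (the name f_make_expression_debug is an assumption), the generator query is assigned to the OUT parameter instead of being passed to EXECUTE:

```sql
-- Hypothetical debug variant: returns the generated SQL text instead of running it.
CREATE OR REPLACE FUNCTION f_make_expression_debug(_expr text, OUT result text) AS
$func$
BEGIN
   -- Same generator query as in f_make_expression, assigned rather than executed
   SELECT 'SELECT ' || string_agg('replace(', '') || '$1,'
        || string_agg(format('%L,%L)', conceptid::text, v.term), ','
                      ORDER BY conceptid DESC)
   INTO   result
   FROM  (
      SELECT conceptid::bigint
      FROM   regexp_split_to_table(_expr, '\D+') conceptid
      WHERE  conceptid <> ''
      ) m
   JOIN   tbl_values v USING (conceptid);
END
$func$ LANGUAGE plpgsql;

SELECT f_make_expression_debug(expression) FROM tbl_main;
```

The result should be a single SELECT with one nested replace() call per matched conceptid, wrapped around $1. If no conceptid in the expression matches tbl_values, the aggregate yields NULL, which is worth checking before executing the string.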

SQL function

Here is a pure SQL alternative. Probably also faster:

CREATE OR REPLACE FUNCTION f_make_expression_sql(_expr text)
  RETURNS text AS
$func$
SELECT string_agg(CASE WHEN $1 ~ '^\d'
                       THEN txt || COALESCE(v.term, t.conceptid) 
                       ELSE COALESCE(v.term, t.conceptid) || txt END
                , '' ORDER BY rn) AS result
FROM  (
   SELECT *, row_number() OVER () AS rn
   FROM  (
      SELECT regexp_split_to_table($1, '\D+') conceptid
           , regexp_split_to_table($1, '\d+') txt
      ) sub
   ) t
LEFT  JOIN tbl_values v ON v.conceptid = NULLIF(t.conceptid, '')::bigint
$func$  LANGUAGE sql STABLE;
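
The function relies on splitting the same string twice, once on non-digits and once on digits, so that the numbers and the text between them line up position by position. A minimal sketch of that idea on a made-up input ('12:34=(56)' is an assumption for illustration), using regexp_split_to_array to show both lists side by side:

```sql
-- Hypothetical input string; both splits yield four pieces here.
SELECT regexp_split_to_array('12:34=(56)', '\D+') AS numbers     -- pieces: '12', '34', '56', ''
     , regexp_split_to_array('12:34=(56)', '\d+') AS separators; -- pieces: '', ':', '=(', ')'
```

Because this input starts with a digit, each row is stitched together as separator followed by number, which is the case the CASE expression selects with $1 ~ '^\d'.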

In Postgres 9.4 this can be much more elegant with two new features, ROWS FROM and WITH ORDINALITY:

CREATE OR REPLACE FUNCTION f_make_expression_sql(_expr text)
  RETURNS text AS
$func$
SELECT string_agg(CASE WHEN $1 ~ '^\d'
                       THEN txt || COALESCE(v.term, t.conceptid) 
                       ELSE COALESCE(v.term, t.conceptid) || txt END
                , '' ORDER BY rn) AS result
FROM   ROWS FROM (
          regexp_split_to_table($1, '\D+')
        , regexp_split_to_table($1, '\d+')
       ) WITH ORDINALITY AS t(conceptid, txt, rn)
LEFT   JOIN tbl_values v ON v.conceptid = NULLIF(t.conceptid, '')::bigint
$func$  LANGUAGE sql STABLE;
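
To see what the FROM clause produces on its own, the construct can be run standalone (Postgres 9.4 or later). The input '12:34' is a made-up value chosen so that the two splits return a different number of pieces:

```sql
-- Hypothetical input: 2 numbers, but 3 text pieces ('', ':', '').
SELECT *
FROM   ROWS FROM (
          regexp_split_to_table('12:34', '\D+')
        , regexp_split_to_table('12:34', '\d+')
       ) WITH ORDINALITY AS t(conceptid, txt, rn);
```

This should return three rows numbered by rn, with conceptid NULL in the last one, since ROWS FROM pads the shorter result with NULL values. In the function above, such a NULL row contributes nothing to string_agg.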

SQL Fiddle demonstrating all for Postgres 9.3.

There's also another way, without creating functions: using WITH RECURSIVE. I have used it with a lookup table of thousands of rows.

You'll need to change the following table and column names to your own:

tbl_main, strsourcetext, strreplacedtext;

lookuptable, strreplacefrom, strreplaceto.

WITH RECURSIVE replaced AS (
   (
   SELECT strsourcetext
        , strreplacedtext
        , array_agg(strreplacefrom ORDER BY length(strreplacefrom) DESC, strreplacefrom, strreplaceto) AS arrreplacefrom
        , array_agg(strreplaceto ORDER BY length(strreplacefrom) DESC, strreplacefrom, strreplaceto) AS arrreplaceto
        , count(1) AS intcount
        , 1 AS intindex
   FROM   tbl_main, lookuptable
   WHERE  tbl_main.strsourcetext LIKE '%' || strreplacefrom || '%'
   GROUP  BY strsourcetext
   )
   UNION ALL
   SELECT strsourcetext
        , replace(strreplacedtext, arrreplacefrom[intindex], arrreplaceto[intindex]) AS strreplacedtext
        , arrreplacefrom
        , arrreplaceto
        , intcount
        , intindex+1 AS intindex
   FROM   replaced
   WHERE  intindex<=intcount
   )
SELECT strsourcetext
     , (array_agg(strreplacedtext ORDER BY intindex DESC))[1] AS strreplacedtext
FROM   replaced
GROUP  BY strsourcetext
