
PostgreSQL Update Column Function to Strip Specific Characters

I want to write a function that updates a column and removes specific values from it.

For example, I have a column value like this:

'187-2026-00130740 167'

The first number (187) and the last number (167) are random. In the function I capture them in declared variables, then run an update that removes them from the column.

_colid collects all ids from the table as an array. For each id in that array I extract the first and last number, then replace those numbers with an empty string.

My function basically looks like this:

CREATE OR REPLACE FUNCTION fix() RETURNS integer as $$
Declare _colid varchar[] := (SELECT '{{' || string_agg(concat_ws(',',id), '},{') || '}}' as a
FROM News n);
Declare _newsslice varchar[];
Declare _firstnumber varchar;
Declare _lastnumber varchar;
BEGIN
FOREACH _newsslice SLICE 1 IN ARRAY _colid 
LOOP
_firstnumber := (select regexp_replace(substring(split_part(token, '-', 1)from '[0-9]+'), '\s+$', '') AS x FROM news where id = _newsslice[1]::int and length(split_part(token, '-', 1)) < 4);

_lastnumber :=(select regexp_replace(substring(regexp_replace(token, '^. ', '')from '[0-9]+'), '\s+$', ' ') from news where id = _newsslice[1]::int and length(regexp_replace(substring(regexp_replace(token, '^. ', '')from '[0-9]+'), '\s+$', ' ')) < 4);

IF(_firstnumber IS NOT NULL) THEN
  update news set token = regexp_replace(regexp_replace(regexp_replace(trim(from token, ' ', ''),_colnumber,''),_firstnumber , '') where id = _newsslice[1]::int;
END IF;
END LOOP;
RETURN 1;
END; $$
LANGUAGE plpgsql;

This updates the column and deletes the first number as I want, but it does not delete the last number. What should I do, and how should I approach this?

This problem is better solved with plain SQL than with a function.

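-- Step 1: drop the prefix (up to 3 characters plus the hyphen).
-- "between 1 and 4" skips rows with no hyphen at all (position = 0),
-- which would otherwise be rewritten unchanged.
update news
set
    token = substring(token from position('-' in token) + 1)
where
    position('-' in token) between 1 and 4;

-- Step 2: drop the suffix digits that follow a space in the last 4 characters.
-- Note: the separating space itself is kept; only what follows it is removed.
update news
set
    token = substring(token from 1 for (length(token) - 4 + position(' ' in substring(token from length(token) - 3))))
where
    position(' ' in substring(token from length(token) - 3)) > 0;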

This solution assumes that:

  • token may have a prefix, a suffix, both, or neither
  • the prefix is up to 3 digits followed by a hyphen
  • the suffix is a space followed by up to 3 digits

I inferred this from your description and the code. The actual requirement may be simpler, for example if all prefixes are exactly 3 digits.
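One way to check whether the requirement really is simpler is to count how long the prefixes are in the existing data. The query below is only a sketch: it treats everything before the first hyphen as the prefix whenever that hyphen falls within the first 4 characters, which is the same assumption the updates make.

```sql
-- Sketch: distribution of prefix lengths among rows that have a short prefix.
-- prefix_length = number of characters before the first hyphen.
select
    position('-' in token) - 1 as prefix_length,
    count(*)                   as row_count
from news
where
    position('-' in token) between 1 and 4
group by 1
order by 1;
```

If only one prefix_length comes back, a fixed-width substring would be enough for the prefix.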

If the prefix and suffix also have to be checked to be numeric, that check can be added while still using plain SQL.
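A minimal sketch of such a numeric check, assuming regular expressions are acceptable here: regexp_replace only strips the prefix when it consists of digits followed by a hyphen, and only strips the suffix when it is a space followed by digits. The {0,3} bounds mirror the "up to 3 digits" assumption; change them to {1,3} if at least one digit must be present. Unlike the substring version, this variant also removes the separating space.

```sql
-- Sketch: strip a numeric prefix and a numeric suffix in a single statement.
-- Without the 'g' flag, regexp_replace replaces only the first match,
-- so the inner "2026-" part is not touched in the same pass.
update news
set
    token = regexp_replace(
                regexp_replace(token, '^[0-9]{0,3}-', ''),  -- prefix: digits + hyphen
                ' [0-9]{0,3}$', '')                         -- suffix: space + digits
where
    token ~ '^[0-9]{0,3}-'
    or token ~ ' [0-9]{0,3}$';
```

Against the nine test rows listed in this answer, this should leave every token as 2026-00130740 with no trailing space, but it is worth previewing with a select on real data first.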

The SQL solution is better because:

  • It is shorter and simpler
  • It avoids the loop boundary conditions that we all make mistakes with (me too)
  • It will run faster

It would be interesting to know how many rows you are dealing with.

I would be inclined to create a new table with the tokens in the required form, compare the old and new tables, and then drop the old table and rename the new one. This may also be faster than running lots of updates. If you do use the update approach, make sure to vacuum afterwards.
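The new-table approach could look roughly like the sketch below. The names news_new and news_old are hypothetical, and the cleaning expression is an assumed regular-expression stand-in for whichever rule you settle on. Note that create table ... as copies only the data, so any indexes, constraints, defaults and permissions on news would have to be recreated by hand.

```sql
-- Sketch: build a cleaned copy next to the original table.
create table news_new as
select
    id,
    regexp_replace(
        regexp_replace(token, '^[0-9]{0,3}-', ''),
        ' [0-9]{0,3}$', '')::varchar(32) as token
from news;

-- Compare old and new: list every row whose token changed.
select o.id, o.token as old_token, n.token as new_token
from news o
join news_new n using (id)
where o.token is distinct from n.token;

-- Once the comparison looks right, swap the tables.
begin;
alter table news rename to news_old;
alter table news_new rename to news;
commit;

-- Drop the old data only when you are sure it is no longer needed.
-- drop table news_old;
```

If you update in place instead, running vacuum analyze news; afterwards lets the space held by dead row versions be reused and refreshes the planner statistics.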

My test data is:

create table news
(
    id        int,
    token     varchar(32)
);

insert into news (id, token) values (1, '187-2026-00130740 167');
insert into news (id, token) values (2, '18-2026-00130740 167');
insert into news (id, token) values (3, '1-2026-00130740 167');
insert into news (id, token) values (4, '-2026-00130740 167');
insert into news (id, token) values (5, '2026-00130740 167');
insert into news (id, token) values (6, '187-2026-00130740 16');
insert into news (id, token) values (7, '187-2026-00130740 1');
insert into news (id, token) values (8, '187-2026-00130740 ');
insert into news (id, token) values (9, '187-2026-00130740');

and the results are:

 id |     token      
----+----------------
  9 | 2026-00130740
  5 | 2026-00130740 
  1 | 2026-00130740 
  2 | 2026-00130740 
  3 | 2026-00130740 
  4 | 2026-00130740 
  6 | 2026-00130740 
  7 | 2026-00130740 
  8 | 2026-00130740 
(9 rows)
