Postgres function to split words into an array with additional logic

Question

I've been playing around in psql, splitting names into arrays, for example:

select string_to_array('joseph jones', ' ');
string_to_array 
-----------------
{joseph,jones}

This works exactly as I expect.

However, my dataset contains many surnames that are preceded by an "o".

select string_to_array('joseph o carroll', ' ');
string_to_array 
-----------------
{joseph,o,carroll}

Is there a way to add some extra logic so that if a word is preceded by " o ", the "o" gets bundled with the next word?

So joseph o carroll would return {joseph,o carroll}

Answer 1

I think I found a solution using a regular expression:

select regexp_split_to_array('joseph o jones','(?<!o)(\s+)');

Answer 2

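You can't just use (?<!o)\s+ ; try it on romeo bones. Because the first name ends in o, the negative look-behind rejects the space after it, so the expression does not match and the name is not split.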
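A quick way to see the problem is to run the simpler pattern against that name. This is only a sketch: it assumes PostgreSQL 9.6 or later (look-behind support) and the default standard_conforming_strings = on, so the backslash is written once.

```sql
-- The space follows an 'o', so the negative look-behind rejects it
-- and the string comes back as a single array element.
select regexp_split_to_array('romeo bones', '(?<!o)\s+');
-- {"romeo bones"}
```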

Use

select regexp_split_to_array('joseph o jones','(?<!\yo)\s+');

Explanation

--------------------------------------------------------------------------------
  (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
    \y                       the boundary between a word char (\w)
                             and something that is not a word char
--------------------------------------------------------------------------------
    o                        'o'
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the most amount
                           possible))
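To compare the cases discussed above side by side, something like the following could be run. It is a minimal sketch rather than a complete solution: the inline values list and the column aliases name and parts are made up for illustration, and it assumes standard_conforming_strings = on.

```sql
-- Split each sample name on whitespace, except whitespace that
-- directly follows a standalone "o" (word boundary, then 'o').
select name,
       regexp_split_to_array(name, '(?<!\yo)\s+') as parts
from (values ('joseph jones'),
             ('joseph o carroll'),
             ('romeo bones')) as t(name);
-- joseph jones     | {joseph,jones}
-- joseph o carroll | {joseph,"o carroll"}
-- romeo bones      | {romeo,bones}
```

Note that Postgres quotes an array element that contains a space, so the bundled surname is displayed as "o carroll".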

Answer 1
1 accepted 2020-10-21 11:24:33

Answer 2
1 2020-10-21 21:34:10
