regexp_replace string

Question

I'm using regexp_replace to standardize mailing addresses, and I've run into a case I can't get right.

Consider the following two addresses and what their results should be:

115 1/2 East 6th St -> 115 1/2 E 6th St
818 East St -> 818 East St

In the second address, "East" is the actual name of the street, not a directional indicator.

For my query, I've attempted

SELECT
     regexp_replace(address, 'East[^ St]', 'E ')

but this fails to convert the first address to its proper format.

How can I write my regexp_replace so that the word East is converted to 'E' in the first address but left intact in the second address?

Answer 1

Your current pattern matches the literal text East followed by any single character that isn't a space, S, or t. In both of your addresses East is followed by a space, so the pattern never matches. I'm assuming you meant to use a negative lookahead to make sure that "East" doesn't come before " St", but sadly Oracle doesn't support negative lookaheads. Instead, you'll need to make the REGEXP_REPLACE conditional:

CASE
    WHEN address LIKE '%East%' AND address NOT LIKE '%East St%'
        THEN REGEXP_REPLACE(address, your_pattern, your_replacement)
    ELSE address
END
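
To make the placeholders concrete, here is a minimal sketch run against the two addresses from the question. The pattern ' East ' and the replacement ' E ' are assumptions filling in for your_pattern and your_replacement, and the LIKE conditions are tightened with surrounding spaces so that only the standalone word is tested; adjust both to your data.

```sql
-- Sketch only: ' East ' / ' E ' are assumed stand-ins for
-- your_pattern / your_replacement in the CASE expression above.
WITH tbl(address) AS (
    SELECT '115 1/2 East 6th St' FROM dual UNION ALL
    SELECT '818 East St'         FROM dual
)
SELECT address,
       -- The original attempt: the character after "East" is a space,
       -- which the class [^ St] excludes, so nothing is replaced.
       REGEXP_REPLACE(address, 'East[^ St]', 'E ') AS original_attempt,
       -- The conditional version: replace only when "East" is a
       -- standalone word that is not immediately followed by "St".
       CASE
           WHEN address LIKE '% East %' AND address NOT LIKE '% East St%'
               THEN REGEXP_REPLACE(address, ' East ', ' E ')
           ELSE address
       END AS conditional_result
FROM tbl;
```

With these inputs, original_attempt leaves both rows unchanged, while conditional_result turns the first row into 115 1/2 E 6th St and keeps 818 East St as it is. Note that the NOT LIKE test would also skip an address such as "East Street", which may or may not be what you want.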

Answer 2

This answers your question with REGEXP_REPLACE() alone. It looks for the string ' East' (with the leading space, so as not to catch the case where 'East' is the end of another word), remembered as the first group, followed by a space, one or more characters, another space and the string 'St', which are remembered together as the second group. If found, the match is replaced with ' E' followed by the second remembered group (the space, the one or more characters, the space and 'St'). This is needed because those characters are 'consumed' by the regex engine as it moves left to right analyzing the string, so you need to put them back. Note I added a bunch of different test formats (always test for the unexpected too!):

SQL> with tbl(address) as (
     select '115 1/2 East 6th St'       from dual union
     select '115 1/2 NorthEast 6th St'  from dual union
     select '115 1/2 East 146th St'     from dual union
     select '115 1/2 East North 1st St' from dual union
     select '818 East Ave'              from dual union
     select '818 Woodward'              from dual union
     select '818 East St'               from dual
   )
   select regexp_replace(address, '( East)( .+ St)', ' E\2') new_addr
   from tbl;

NEW_ADDR
------------------------------------------------------------------------
115 1/2 E 146th St
115 1/2 E 6th St
115 1/2 E North 1st St
115 1/2 NorthEast 6th St
818 East Ave
818 East St
818 Woodward

7 rows selected.
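
To see why the \2 backreference is needed, the following sketch runs the same pattern on a single address with and without it. The column aliases are illustrative names, not part of the original answer.

```sql
-- Same pattern as above; only the replacement string differs.
WITH tbl(address) AS (
    SELECT '115 1/2 East 6th St' FROM dual
)
SELECT -- Without \2 the whole match (' East 6th St') is replaced by ' E',
       -- so the consumed tail of the address is lost.
       REGEXP_REPLACE(address, '( East)( .+ St)', ' E')   AS without_backref,
       -- With \2 the second group (' 6th St') is put back after ' E'.
       REGEXP_REPLACE(address, '( East)( .+ St)', ' E\2') AS with_backref
FROM tbl;
```

The first column should come back as 115 1/2 E and the second as 115 1/2 E 6th St. Since the first group is never referenced in the replacement, its parentheses are optional; they are kept here only to match the query above.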


Answer 1: score 2, accepted, 2016-02-24 17:36:57

Answer 2: score 1, 2016-02-24 22:51:09
