regexp_replace string
I'm using regexp_replace to standardize mailing addresses, and I've run into a case I can't get right.
Consider the following two addresses and what their results should be:
In the second address, "East" is the actual name of the street, not a directional indicator.
For my query, I've attempted
SELECT
regexp_replace(address, 'East[^ St]', 'E ')
but this fails to convert the first address to its proper format.
How can I write my regexp_replace so that the word East is converted to an 'E' in the first address but left intact in the second address?
Your current pattern matches the literal text East followed by any single character that isn't a space, S, or t. A bracket expression such as [^ St] excludes individual characters; it does not exclude the string " St". I'm assuming you probably meant to use a negative lookahead to make sure that "East" doesn't come before " St", but sadly Oracle doesn't support negative lookaheads. Instead, you'll need to make the REGEXP_REPLACE conditional:
CASE
WHEN address LIKE '%East%' AND address NOT LIKE '%East St%'
THEN REGEXP_REPLACE(address, your_pattern, your_replacement)
ELSE address
END
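To make the two points above concrete, here is a minimal sketch in Python rather than Oracle SQL, chosen only because it is easy to run. It assumes Python's re module treats these simple patterns the same way Oracle does. The helper name abbreviate_east and the ' East ' pattern standing in for your_pattern are hypothetical, not part of the original answer.

```python
import re

# 'East[^ St]' means: "East" followed by ONE character that is
# not a space, not 'S', and not 't'. It does not mean "not followed by ' St'".
char_class = re.compile(r"East[^ St]")

# "East" is followed by a space here, which the class excludes, so nothing matches.
print(char_class.sub("E ", "115 1/2 East 6th St"))  # 115 1/2 East 6th St
# Here "East" is followed by 'e', so the pattern matches "Easte" and mangles the word.
print(char_class.sub("E ", "12 Eastern Ave"))       # 12 E rn Ave


def abbreviate_east(address):
    """Mirror the CASE expression: replace only when 'East St' is absent."""
    # Equivalent of: address LIKE '%East%' AND address NOT LIKE '%East St%'
    if "East" in address and "East St" not in address:
        # Hypothetical stand-in for REGEXP_REPLACE(address, your_pattern, your_replacement)
        return re.sub(r" East ", " E ", address)
    return address  # ELSE address


print(abbreviate_east("115 1/2 East 6th St"))  # 115 1/2 E 6th St
print(abbreviate_east("818 East St"))          # 818 East St
```

One caveat of the substring test: an address such as '100 East Stanley Ave' also contains 'East St', so the condition would skip it as well.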
This answers your question with REGEXP_REPLACE(). It looks for the string ' East' (with a leading space, so it doesn't catch the case where 'East' is the end of another word) followed by a space, one or more characters, another space, and the string 'St'; that trailing portion is remembered in a second group. If found, the match is replaced with ' E' followed by the second remembered group (the space, the one or more characters, the space, and 'St'). This is needed because those characters are 'consumed' by the regex engine as it moves left to right analyzing the string, so you need to put them back. Note I added a bunch of different test formats (always test for the unexpected too!):
SQL> with tbl(address) as (
select '115 1/2 East 6th St' from dual union
select '115 1/2 NorthEast 6th St' from dual union
select '115 1/2 East 146th St' from dual union
select '115 1/2 East North 1st St' from dual union
select '818 East Ave' from dual union
select '818 Woodward' from dual union
select '818 East St' from dual
)
select regexp_replace(address, '( East)( .+ St)', ' E\2') new_addr
from tbl;
NEW_ADDR
------------------------------------------------------------------------
115 1/2 E 146th St
115 1/2 E 6th St
115 1/2 E North 1st St
115 1/2 NorthEast 6th St
818 East Ave
818 East St
818 Woodward
7 rows selected.
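The same capture-group idea can be checked outside the database. The sketch below uses Python's re module instead of Oracle, on the assumption that this particular pattern behaves identically in both engines. The function name abbreviate is hypothetical.

```python
import re

# Group 1: ' East' (the leading space keeps 'NorthEast' from matching).
# Group 2: a space, one or more characters, a space, then 'St'.
PATTERN = re.compile(r"( East)( .+ St)")


def abbreviate(address):
    # Replace the whole match with ' E' and put group 2 back,
    # since the engine consumed it while matching.
    return PATTERN.sub(r" E\2", address)


samples = [
    "115 1/2 East 6th St",
    "115 1/2 NorthEast 6th St",
    "115 1/2 East North 1st St",
    "818 East Ave",
    "818 East St",
]

for s in samples:
    print(f"{s} -> {abbreviate(s)}")
```

'818 East St' is left alone because nothing sits between ' East' and ' St' for the '.+' to match, which is what keeps a street actually named East intact.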