Regular expression to replace text

I am very new to regular expressions. I am using UltraEdit and would like to use regular expressions to make the changes described below.

I have some text in the following pattern:

<Music href="6000111.genre" title="AAA">
    <Music format="ditamap" href="000760.rock" title="222"/>
    <Music format="ditamap" href="000756.rock" title="333"/>
</Music>

I need to add the prefix 'Z' in front of every href value that has the extension .rock.

href="000760.rock" --> href="Z000760.rock"

The output should look like this:

<Music href="6000111.genre" title="AAA">
    <Music format="ditamap" href="Z000760.rock" title="222"/>
    <Music format="ditamap" href="Z000756.rock" title="333"/>
</Music>

What would be the regular expression to do this in UltraEdit?

I rewrote my answer to:

  1. Cover the new use case the OP added, where some values already have the X prefix and must not be replaced.
  2. Remove the brackets I had initially put around the double quote character, which were not needed.

The first case I answered is where none of the HREF values already have the X prefix. (The examples below use X as the prefix; substitute Z, as in the question, if that is what you need.)

Find:

href="([^"]*)\.rock"

And replace:

href="X\1.rock"

Start:

<Music href="6000111.genre" title="AAA">
    <Music format="ditamap" href="000760.rock" title="222"/>
    <Music format="ditamap" href="000756.rock" title="333"/>
</Music>

Finish:

<Music href="6000111.genre" title="AAA">
    <Music format="ditamap" href="X000760.rock" title="222"/>
    <Music format="ditamap" href="X000756.rock" title="333"/>
</Music>

[Screenshot showing this first result.]

Breakdown of the regex:

  1. Find: href="([^"]*)\.rock"
    1. href=" - matches the literal text href="
    2. ([^"]*) - the parentheses create the first capture group: the engine remembers whatever [^"]* matches so that we can reference it in the replacement.
      1. [^"] - matches any single character that is not a double quote.
      2. The asterisk at the end of [^"]* is a quantifier meaning zero or more of the item just before it (so: zero or more characters that are not a double quote).
    3. \.rock" - the rest of the pattern, which must be the literal text .rock"
    4. Note that I have escaped the period character: \. That is because an unescaped period has a special meaning in a regex (it matches any character), and the backslash tells the engine that we mean a literal dot.
  2. Replace: href="X\1.rock"
    1. href="X - outputs the literal text href="X
    2. \1 - is replaced with the contents of the first capture group (the zero or more non-quote characters matched above).
    3. .rock" - outputs the literal text .rock"
      1. Note that I didn't need to escape the period here, because it has no special meaning in the replacement; it is just a literal dot.
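
To check the pattern outside the editor, the same find/replace pair can be tried in Python. This is only a sketch: Python's re module stands in for UltraEdit's Perl-style regex engine here, and the variable names (FIND, REPLACE, sample) are made up for illustration.

```python
import re

# The find/replace pair from the breakdown above.
FIND = r'href="([^"]*)\.rock"'
REPLACE = r'href="X\1.rock"'

sample = '''<Music href="6000111.genre" title="AAA">
    <Music format="ditamap" href="000760.rock" title="222"/>
    <Music format="ditamap" href="000756.rock" title="333"/>
</Music>'''

# re.sub replaces every match; \1 inserts the text captured by ([^"]*).
result = re.sub(FIND, REPLACE, sample)
print(result)
```

The .genre line is left alone because [^"]* cannot cross the closing quote, so the pattern never reaches a .rock" on that value.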

The second case is in response to the OP's comment that some of the HREF values already have the X prefix. In this case, change the regex as below.

Find:

href="([^X][^"]*)\.rock"

And replace:

href="X\1.rock"

Start:

<Music href="6000111.genre" title="AAA">
    <Music format="ditamap" href="000760.rock" title="222"/>
    <Music format="ditamap" href="X000756.rock" title="333"/>
    <Music format="ditamap" href="000757.rock" title="444"/>
    <Music format="ditamap" href="X000758.rock" title="555"/>
    <Music format="ditamap" href="000759.rock" title="666"/>
</Music>

Finish:

<Music href="6000111.genre" title="AAA">
    <Music format="ditamap" href="X000760.rock" title="222"/>
    <Music format="ditamap" href="X000756.rock" title="333"/>
    <Music format="ditamap" href="X000757.rock" title="444"/>
    <Music format="ditamap" href="X000758.rock" title="555"/>
    <Music format="ditamap" href="X000759.rock" title="666"/>
</Music>

[Screenshot showing this second result.]

Breakdown of the regex:

  1. Find: href="([^X][^"]*)\.rock"
    1. href=" - matches the literal text href="
    2. ([^X][^"]*) - the parentheses create the first capture group: the engine remembers whatever [^X][^"]* matches so that we can reference it in the replacement.
      1. [^X] - matches exactly one character that is not an X. This is what skips values that already start with the X prefix.
      2. [^"] - matches any single character that is not a double quote.
      3. The asterisk at the end of [^"]* is a quantifier meaning zero or more of the item just before it (so: zero or more characters that are not a double quote).
    3. \.rock" - the rest of the pattern, which must be the literal text .rock"
    4. Note that I have escaped the period character: \. That is because an unescaped period has a special meaning in a regex (it matches any character), and the backslash tells the engine that we mean a literal dot.
  2. Replace: href="X\1.rock"
    1. href="X - outputs the literal text href="X
    2. \1 - is replaced with the contents of the first capture group (one character that is not an X, followed by zero or more characters that are not a double quote).
    3. .rock" - outputs the literal text .rock"
      1. Note that I didn't need to escape the period here, because it has no special meaning in the replacement; it is just a literal dot.
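
The second pattern can be checked the same way. Again a sketch only, with Python's re module assumed as a stand-in for UltraEdit's engine and illustrative variable names; the sample is the "Start" text from this case.

```python
import re

# [^X] requires the first character of the value to be something other than X,
# so values that already carry the prefix are not matched at all.
FIND = r'href="([^X][^"]*)\.rock"'
REPLACE = r'href="X\1.rock"'

sample = '''<Music href="6000111.genre" title="AAA">
    <Music format="ditamap" href="000760.rock" title="222"/>
    <Music format="ditamap" href="X000756.rock" title="333"/>
    <Music format="ditamap" href="000757.rock" title="444"/>
    <Music format="ditamap" href="X000758.rock" title="555"/>
    <Music format="ditamap" href="000759.rock" title="666"/>
</Music>'''

result = re.sub(FIND, REPLACE, sample)
print(result)
```

Running the replacement a second time on its own output should change nothing, which is a quick way to confirm that already-prefixed values are skipped.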

I'm not sure about UltraEdit, but I assume it's close to Notepad++:

Find what: (href=")(.+?\.rock")
Replace with: $1X$2

Use X or Z as the prefix, as it's not clear in your question.
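
One thing worth checking with this pattern is that .+? can run past a closing quote, whereas [^"]* cannot. The sketch below uses Python's re module as an assumed stand-in (Python writes the replacement as \1X\2 rather than $1X$2), and tricky_line with its src attribute is a made-up input, not something from the question.

```python
import re

FIND = r'(href=")(.+?\.rock")'
REPLACE = r'\1X\2'  # Python spelling of $1X$2

# A line shaped like the ones in the question.
ok_line = '<Music format="ditamap" href="000760.rock" title="222"/>'
# Hypothetical line: a non-.rock href followed by another attribute ending in .rock
tricky_line = '<Music href="6000111.genre" src="000760.rock"/>'

ok_result = re.sub(FIND, REPLACE, ok_line)
tricky_result = re.sub(FIND, REPLACE, tricky_line)
print(ok_result)
print(tricky_result)
```

On the question's data both approaches give the same result; on the hypothetical line the lazy .+? stretches across the quote and the .genre value gets the prefix, so the [^"]* form is the safer choice if such lines can occur.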
