How to use a regular expression to replace text between tags in Notepad++

Question

I have code like this:

<pre><code>Some <a href="">HTML</a> code</code></pre>

I need to escape the HTML between the <pre><code></code></pre> tags. I have lots of these tags, so I thought: why not let a regex do it for me? The problem is I don't know how. I've found lots of examples through Google and Stack Overflow, but nothing I could use. Can someone here help me?

Example:

<pre><code>Some <a href="http">HTML</a> code</code></pre>

To

<pre><code>Some &lt;a href=&quot;http&quot;&gt;HTML&lt;/a&gt; code</code></pre>

Or just a regex that lets me replace whatever is between the <pre><code> and </code></pre> tags one by one. I'm almost certain that this can be done.

Answer 1

A regular expression to return "the thing between <pre><code> and </code></pre>" could be

/(?<=<pre><code>).*?(?=<\/code><\/pre>)/

This uses lookaround expressions to delimit what gets matched: the lookbehind and lookahead assert that the tags are present without including them in the match. Using regex on nested tags is typically fraught with danger, and you are much better off using "real tools" made specifically for parsing XML, HTML, etc. I am a huge fan of Beautiful Soup (Python) myself. I'm not familiar with Notepad++, so I'm not sure whether its regex dialect supports this expression exactly.
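For readers who would rather script the whole escape than do it in the editor, here is a minimal Python sketch of the same lookaround idea. It assumes the blocks are not nested and that their contents are not already escaped (otherwise `&` would be escaped a second time). The name `escape_code_blocks` is made up for this illustration, and `re.DOTALL` is added so a block may span several lines.

```python
import html
import re

# Same lookaround pattern as above, without the /.../ delimiters.
# re.DOTALL lets "." cross line breaks inside a block.
BLOCK = re.compile(r"(?<=<pre><code>).*?(?=</code></pre>)", re.DOTALL)


def escape_code_blocks(text: str) -> str:
    """Escape HTML special characters inside each <pre><code>...</code></pre> block."""
    # Only the text between the tags is matched, so only that text is escaped.
    return BLOCK.sub(lambda m: html.escape(m.group(0)), text)


sample = '<pre><code>Some <a href="http">HTML</a> code</code></pre>'
result = escape_code_blocks(sample)
print(result)
```

Because the tags sit in lookarounds, they are never part of the match and come through unchanged; text outside the blocks is left alone as well.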

Answer 2

This regex will match the parts of the anchor tag you need to put back:

<pre><code>([^<]*?)<a href="(.*?)">(.*?)</a>(.*?)</code></pre>

See a live demo, which shows it matching correctly and also shows the various parts being captured as groups, which we'll refer to in the replacement string (see below).

Use the regex above with the following replacement:

<pre><code>\1&lt;a href=&quot;\2&quot;&gt;\3&lt;/a&gt;\4</code></pre>

The \1, \2, etc. are the groups captured by the regex; they put back the parts of the match we're keeping.
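To check the capture groups outside the editor, the same pattern and replacement can be tried in Python, whose `re.sub` also accepts `\1`-style backreferences. This is only a sketch: the replacement here closes with `</code></pre>` so that it matches the expected output in the question, and the pattern handles a single anchor tag per block.

```python
import re

# Groups: 1 = text before the link, 2 = href value,
#         3 = link text, 4 = text after the link.
PATTERN = re.compile(
    r'<pre><code>([^<]*?)<a href="(.*?)">(.*?)</a>(.*?)</code></pre>'
)

# Escaped anchor markup with the captured pieces put back in place.
REPLACEMENT = r'<pre><code>\1&lt;a href=&quot;\2&quot;&gt;\3&lt;/a&gt;\4</code></pre>'

before = '<pre><code>Some <a href="http">HTML</a> code</code></pre>'
after = PATTERN.sub(REPLACEMENT, before)
print(after)
```

A block with two or more links, or with tags other than `<a>`, would need a more general approach, such as escaping everything between the tags in one pass.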

2 answers

Answer 1
1 2013-07-27 03:31:19

Answer 2
1 Accepted 2013-07-27 05:31:32
