Remove/strip specific HTML tags and replace using Notepad++

Question

Here is my text:

<h3>#6</h2>
Is he eating a lemon?

</div>

I have a few of these in my articles. The #number is always different, and so is the text.

I want to turn it into this:

<h3>#6 Is he eating a lemon?</h3>

I tried it with a regex in Notepad++, but I am still very new to this:

My search:

<h3>.*?</h2>\r\n.*?\r\n\r\n</div>

Also see here.

Now it always selects the correct part of the text.

What does my replace command need to look like to get the output above?

Answer 1

You should modify your original regex to capture the text you want in groups, like this:

<h3>(.*?)</h2>\r\n(.*?)\r\n\r\n</div>
    (   )         (   ) 
//  ^             ^     These are your capture groups

You can then access these groups with the \1 and \2 tokens respectively.

So your replace pattern would look like:

<h3>\1 \2</h3>
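
To check the pattern outside Notepad++, here is a minimal sketch in Python's `re` module, which uses the same `\1` and `\2` backreference syntax in replacements. The function name `merge_headings` and the second sample entry (`#7`) are made up for illustration, and the sample assumes Windows (CRLF) line endings, as the `\r\n` in the search pattern implies.

```python
import re

# Search pattern from the answer: two lazy capture groups,
# with CRLF line endings written out explicitly.
PATTERN = re.compile(r"<h3>(.*?)</h2>\r\n(.*?)\r\n\r\n</div>")

# Replacement: group 1 (the "#number") and group 2 (the text), joined by a space.
REPLACEMENT = r"<h3>\1 \2</h3>"


def merge_headings(text: str) -> str:
    """Rewrite every matching block into a single <h3> line."""
    return PATTERN.sub(REPLACEMENT, text)


# Sample input modeled on the question; the "#7" entry is hypothetical.
sample = (
    "<h3>#6</h2>\r\nIs he eating a lemon?\r\n\r\n</div>\r\n"
    "<h3>#7</h2>\r\nSome other caption\r\n\r\n</div>\r\n"
)

result = merge_headings(sample)
print(result)
```

Text that does not match the pattern passes through unchanged, so the same call can be run over a whole article.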

Answer 2

Your search could be <h3>(.*)<\/h2>\r\n(.*)\r\n\r\n<\/div> , replaced with <h3>$1 $2</h3> , where $1 and $2 stand for the strings captured in the parentheses.
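
The greedy groups and escaped slashes in this answer behave the same as the lazy pattern from Answer 1 on the sample text, because `.` does not match a line feed by default. A small Python sketch of that comparison follows. It is only an illustration: Python writes backreferences as `\1` and `\2` rather than `$1` and `$2`, and the variable names are arbitrary.

```python
import re

# Lazy groups (Answer 1) versus greedy groups with escaped slashes (Answer 2).
lazy = re.compile(r"<h3>(.*?)</h2>\r\n(.*?)\r\n\r\n</div>")
greedy = re.compile(r"<h3>(.*)<\/h2>\r\n(.*)\r\n\r\n<\/div>")

# Sample block from the question, assuming CRLF line endings.
sample = "<h3>#6</h2>\r\nIs he eating a lemon?\r\n\r\n</div>"

# Python's re module uses \1 and \2 where the answer uses $1 and $2.
replacement = r"<h3>\1 \2</h3>"

lazy_result = lazy.sub(replacement, sample)
greedy_result = greedy.sub(replacement, sample)

# Both patterns give the same rewritten heading for this input.
print(lazy_result == greedy_result)
```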