How do I delete everything in Notepad++ except one HTML tag and its content?

Question

I open an HTML page in Notepad++.

The HTML page contains a lot of markup, including this tag:

<div id="issue_content">CONTENT</div>

I'd like to remove everything from the HTML file except this tag and its content:

<div id="issue_content">CONTENT</div>

Example file:

<p>ewrfefsd</p>
<div id="issue_content">CONTENT</div>
<p>ewrfefsd</p>
</html>

After deleting, the file should contain only this:

<div id="issue_content">CONTENT</div>

I tried the regular expression (<div id=\"issue_content\">)(.*?)(<\/div>)(.*?), but it matches only the tag <div id="issue_content">CONTENT</div> and its content, so the replace removes that tag instead of everything around it.
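To make the symptom concrete, here is a minimal Python sketch of what that pattern does. It assumes the replacement was an empty string, and it uses Python's `re` module, which is not the engine Notepad++ uses, so treat it as an illustration only. The sample text is the example file from the question.

```python
import re

# Example file from the question.
html = '<p>ewrfefsd</p>\n<div id="issue_content">CONTENT</div>\n<p>ewrfefsd</p>\n</html>\n'

# The pattern from the question. The trailing lazy group (.*?) is free to
# match nothing, so the overall match ends right after </div>.
pattern = r'(<div id=\"issue_content\">)(.*?)(<\/div>)(.*?)'

# Assumption: the replacement was empty, which deletes exactly what matched.
result = re.sub(pattern, '', html)
print(result)
```

Only the div disappears and the surrounding lines stay, which is the opposite of the goal. To keep the div, the pattern has to match the whole document and the replacement has to put the div back.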

Answer 1

You can change your regex as follows. The idea is that it matches the whole document but captures the string you want in a group, which you then use as the replacement for everything that was matched.

This is the regex:

/[\s\S]*?(<div id=\"issue_content\">[^>]+>)[\s\S]+/

It matches everything from the start up to the string you want, captures that string in group 1, and finally matches everything after it. In Notepad++, enter the pattern without the surrounding / delimiters. Note that [^>]+> only reaches the closing </div> when CONTENT contains no other tags.

When replacing, replace with group 1:

$1

Now only your string remains.
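The same match-everything-and-keep-the-group idea can be sketched in Python. This is only an illustration: Python's `re` module is a different engine from the one in Notepad++, the group reference is written `\1` here, and the sample text is the example file from the question.

```python
import re

# Example file from the question.
html = '<p>ewrfefsd</p>\n<div id="issue_content">CONTENT</div>\n<p>ewrfefsd</p>\n</html>\n'

# Same pattern as in this answer, without the / delimiters.
# [\s\S] matches any character, including newlines.
pattern = r'[\s\S]*?(<div id=\"issue_content\">[^>]+>)[\s\S]+'

# The whole document is matched and replaced by capture group 1.
result = re.sub(pattern, r'\1', html)
print(result)
```

Because the trailing `[\s\S]+` needs at least one character, this pattern expects some text to follow the closing `</div>`, which is the case in the example file.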

Answer 2

Try this, where $str is the variable holding your HTML content. $matches[1] is the captured inner content; the full tag, including the <div> wrapper, is in $matches[0].

preg_match('/<div id="issue_content">(.*)<\/div>/i', $str, $matches);

echo $matches[1];
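For comparison, a rough Python counterpart of this extract-instead-of-replace approach is sketched below. It is not the PHP code above, only an analogue using `re.search`, and the sample text is the example file from the question.

```python
import re

# Example file from the question.
html = '<p>ewrfefsd</p>\n<div id="issue_content">CONTENT</div>\n<p>ewrfefsd</p>\n</html>\n'

# Case-insensitive search, like the /i flag in the PHP pattern.
# Without a dot-matches-newline flag, (.*) stays on a single line.
match = re.search(r'<div id="issue_content">(.*)</div>', html, re.IGNORECASE)

if match:
    print(match.group(1))  # inner content only
    print(match.group(0))  # the whole tag, wrapper included
```

Since `(.*)` is greedy, a line holding several `</div>` tags would be captured up to the last one, so this sketch is best suited to input where the div sits on a line of its own.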

Answer 3

This regex should do what you want. Make sure you check the ". matches newline" box on the Replace tab, and position the cursor at the beginning of the document.

^.*?(<div[^>]*id="issue_content">.*?<\/div>).*$

Replace with \1.

Note that this only works if no other <div> tags are nested inside the one you are looking for.
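The nesting caveat is easy to see in a small Python sketch. Here `re.DOTALL` stands in for the ". matches newline" box, the helper name `keep_issue_div` and both sample strings are made up for illustration, and Python's engine is not the one Notepad++ uses.

```python
import re

# re.DOTALL lets . match newlines, like the ". matches newline" option.
PATTERN = re.compile(r'^.*?(<div[^>]*id="issue_content">.*?<\/div>).*$', re.DOTALL)

def keep_issue_div(html: str) -> str:
    """Replace the whole document with the captured div (hypothetical helper)."""
    return PATTERN.sub(r'\1', html)

flat = '<p>a</p>\n<div id="issue_content">CONTENT</div>\n<p>b</p>\n'
nested = '<p>a</p>\n<div id="issue_content"><div>inner</div>tail</div>\n<p>b</p>\n'

print(keep_issue_div(flat))    # the complete div survives
print(keep_issue_div(nested))  # cut off at the first </div>
```

With the nested input, the lazy `.*?` stops at the first `</div>` it finds, so the text after the inner div and the real closing tag are lost. Handling arbitrary nesting generally calls for an HTML parser rather than a regex.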

3 answers

Answer 1
0 2018-10-11 00:15:33

Answer 2
0 2018-10-11 00:25:49

Answer 3
0 Accepted 2018-10-11 04:47:58
