Regex for adding a meta tag if it does not exist (Notepad++)

Question

I'm trying to come up with a regular expression that will be applied to potentially hundreds of files as a find/replace in Notepad++. It's going to work like an if..else.

Here's what I want to do, expressed as a regex:

If a title tag exists, <meta http-equiv="X-UA-Compatible" content="IE=edge" /> does not exist on the page, and an iframe tag exists, then insert <meta http-equiv="X-UA-Compatible" content="IE=edge" /> right after the title tag.

Sample text:

<title>Some Title</title>
<meta name="description" content="Mydescription." />
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />

 ...

<iframe src="iframeresource"></iframe>

The regex I have so far:

(<title>.*<\s*\/title>).*?(?!<meta http-equiv="X-UA-Compatible" content="IE=edge"\s*\/>.*?<iframe)

It uses a negative lookahead. I need something like a conditional negative lookahead that still lets me perform the substitution if and only if <meta http-equiv="X-UA-Compatible" content="IE=edge" /> does not already exist. I'm not quite sure how to do this with a straight regex.

Any ideas would be most appreciated. Thank you.

Answer 1

HTML parsing is best done with a dedicated DOM parser. A regex can only be used to fix well-structured, consistent HTML code.

If this is the case, use

(?si)\A(?!.*?<meta\s+http-equiv="X-UA-Compatible"\s+content="IE=edge"\s*/>)(.*?<title>.*?</title>)(.*)

and replace with $1\n<meta http-equiv="X-UA-Compatible" content="IE=edge" />$2\n .

(?si) enables . to match line breaks and makes the pattern case-insensitive. \A matches the start of the file. The (?!.*?<meta\s+http-equiv="X-UA-Compatible"\s+content="IE=edge"\s*/>) lookahead fails the match if the meta tag pattern is found anywhere in the file. (.*?<title>.*?</title>) consumes and captures the text up to and including the first title element. Then (.*) matches the rest of the document.
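To try the pattern outside Notepad++, here is a minimal sketch in Python's re module, which accepts the same pattern text but writes the replacement groups as \1 and \2 instead of $1 and $2. The names META, PATTERN, PATTERN_WITH_IFRAME and add_meta are made up for this illustration. The second pattern is an assumption that goes beyond the answer: it adds a positive lookahead for the iframe condition from the question, which the answer's regex does not check.

```python
import re

# Tag to insert (taken from the question).
META = '<meta http-equiv="X-UA-Compatible" content="IE=edge" />'

# The answer's pattern, unchanged.
PATTERN = re.compile(
    r'(?si)\A(?!.*?<meta\s+http-equiv="X-UA-Compatible"\s+content="IE=edge"\s*/>)'
    r'(.*?<title>.*?</title>)(.*)'
)

# Hypothetical variant (not in the answer): also require an <iframe> tag
# somewhere in the file, as the question asks.
PATTERN_WITH_IFRAME = re.compile(
    r'(?si)\A(?!.*?<meta\s+http-equiv="X-UA-Compatible"\s+content="IE=edge"\s*/>)'
    r'(?=.*?<iframe\b)'
    r'(.*?<title>.*?</title>)(.*)'
)

def add_meta(html, pattern=PATTERN):
    """Insert META after the first title element when the pattern matches."""
    # Same replacement as in the answer; the text is returned unchanged
    # when the pattern does not match.
    return pattern.sub(r'\1\n' + META + r'\2\n', html, count=1)

if __name__ == "__main__":
    page = '<title>Some Title</title>\n<iframe src="iframeresource"></iframe>'
    print(add_meta(page, PATTERN_WITH_IFRAME))
```

Because both conditions are lookaheads anchored at \A, a file that already contains the meta tag (or, with the variant, has no iframe) produces no match and is left untouched.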

