Regex to match HTML tags with escaped characters

Question

I'm using a regex to filter out the HTML that I don't want to translate in a localisation project. Normally I use </?\\w+((\\s+\\w+(\\s*=\\s*(?:".*?"|'.*?'|[^'">\\s]+))?)+\\s*|\\s*)/?> but the content I'm translating has escaped characters in the HTML, such as:

<a href\="http\://www.fau.de/studium/zulassung/einschreibung/" target\="_blank"     title\="Externer Link auf die Webseite der FAU">

Can some kind soul help me work out how to match HTML tags that contain slashes where they shouldn't really be?

Answer 1

I use "/<(.|\\n)*?>/g" to match all HTML tags in the text; this has worked well for me when I need to ignore that content.
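As a rough illustration of how that pattern could be applied, here is a minimal JavaScript sketch. It writes the pattern with a single-backslash \n, on the assumption that the doubled backslash shown above is an escaping artifact. The helper name splitTagsAndText, the link text "Einschreibung", and the closing </a> are made up for the example and are not part of the original post.

```javascript
// Pattern from the answer, written as a JavaScript regex literal.
// (.|\n) lets the lazy match continue across line breaks inside a tag.
const TAG_PATTERN = /<(.|\n)*?>/g;

// Hypothetical helper: split the input into "tag" segments (to skip)
// and "text" segments (to translate).
function splitTagsAndText(input) {
  const segments = [];
  let lastIndex = 0;
  for (const match of input.matchAll(TAG_PATTERN)) {
    if (match.index > lastIndex) {
      segments.push({ type: "text", value: input.slice(lastIndex, match.index) });
    }
    segments.push({ type: "tag", value: match[0] });
    lastIndex = match.index + match[0].length;
  }
  if (lastIndex < input.length) {
    segments.push({ type: "text", value: input.slice(lastIndex) });
  }
  return segments;
}

// Sample built from the escaped markup in the question; String.raw keeps
// the backslashes literal. The link text and </a> are assumed.
const sample = String.raw`<a href\="http\://www.fau.de/studium/zulassung/einschreibung/" target\="_blank"     title\="Externer Link auf die Webseite der FAU">Einschreibung</a>`;

const segments = splitTagsAndText(sample);
for (const segment of segments) {
  console.log(segment.type + ": " + segment.value);
}
```

Because the pattern only looks for the next ">" after a "<", the backslashes before "=" and ":" do not affect it. The trade-off is that a literal ">" inside an attribute value would end the match early.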

0 2013-07-10 08:48:43
