
Regex to highlight XML values

DISCLAIMER: I know that using regex on XML is risky and generally a bad idea, but I can only feed regexes into my syntax highlighting engine, and I can't spend the resources required to create a new system just for XML-based languages.


So I'm trying to use a regex to get the values inside XML tags, like this:

<LoremIpsum>I NEED THIS PART</LoremIpsum>

I thought this would be nice and easy, and that I could just use (>.*<\/) . It works perfectly in every online regex tester, but as soon as I try using it in .NET, it completely messes up and I end up with completely unpredictable output. What would be the correct way to do this in a single regular expression, considering I'm using .NET's System.Text.RegularExpressions ?

This is probably because quantifiers in .NET regexes are greedy by default. My suggestion would be to use the non-greedy .*? or to use [^<] instead of . :

(>.*?<\/)
(>[^<]*<\/)

That way the match can't run past a < .
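To make the difference between the three patterns concrete, here is a minimal sketch. It uses Python's re module as a stand-in (an assumption, since the question targets .NET); greedy, lazy and negated-class matching behave the same way in System.Text.RegularExpressions for these patterns. The sample string is hypothetical.

```python
import re

# Hypothetical input: two sibling elements on one line.
sample = "<a>one</a><b>two</b>"

# Greedy: .* runs to the last "</" on the line.
greedy = re.findall(r"(>.*<\/)", sample)

# Lazy: stops at the first "</", but can still cross an opening tag.
lazy = re.findall(r"(>.*?<\/)", sample)

# Negated class: cannot cross any "<", so only plain values match.
negated = re.findall(r"(>[^<]*<\/)", sample)

print(greedy)   # ['>one</a><b>two</']
print(lazy)     # ['>one</', '><b>two</']
print(negated)  # ['>one</', '>two</']
```

Note that the lazy form still swallows the opening <b> tag in its second match, which is why the [^<] variant is the safer of the two.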

You never define what "it completely messes up" means, but try this:

(>.*?<\/)

The ? in .*? makes it a non-greedy match. By default, regular expression quantifiers are greedy, meaning they match as much as possible. The non-greedy form matches as little as possible. To see the difference, match an element whose content is is <a>test</a> of against both forms. With (>.*<\/) you will match: is <a>test</a> of . With (>.*?<\/) you will match: is <a>test .
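The comparison above can be checked with a short sketch. Python's re module is used as a stand-in for .NET here (an assumption), and the outer <b> tag in the sample is hypothetical, chosen only to wrap the content from the example. Note that the group as written also captures the surrounding > and </ delimiters.

```python
import re

# Hypothetical sample: the outer <b> element is an assumption.
sample = "<b>is <a>test</a> of</b>"

# Greedy form: extends to the last "</" in the input.
greedy = re.search(r"(>.*<\/)", sample).group(1)

# Non-greedy form: stops at the first "</" it can reach.
lazy = re.search(r"(>.*?<\/)", sample).group(1)

print(greedy)  # >is <a>test</a> of</
print(lazy)    # >is <a>test</
```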

If you want to avoid any XML tags in the match, then you should use @ThomasWeller's solution.
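If the highlighter colours the whole match rather than a capture group (an assumption about the engine), the > and </ delimiters can be kept out of the match with lookarounds. This is only a sketch layered on the [^<] idea, not something proposed in the answers; it is shown with Python's re module as a stand-in, and .NET supports the same lookbehind and lookahead syntax.

```python
import re

sample = "<LoremIpsum>I NEED THIS PART</LoremIpsum>"

# (?<=>)  : the match must be preceded by ">" (not included in the match)
# [^<]+   : one or more characters that are not "<"
# (?=<\/) : the match must be followed by "</" (not included in the match)
value_only = re.findall(r"(?<=>)[^<]+(?=<\/)", sample)

print(value_only)  # ['I NEED THIS PART']
```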
