
How are regular expressions implemented in .NET?

Original post: 2009-07-10 18:06:13. Tags: c#, .net, regex, performance, grep

I have just read this interesting article about the implementation details for various languages that support regular expressions.

It describes an alternative implementation of regular expressions that simulates finite automata (NFAs and DFAs) instead of backtracking. It claims that backtracking implementations (the kind used in Perl, Java, and others) are susceptible to very slow performance on some particularly "pathological" regular expressions. (grep, awk, and Tcl use the automaton-based approach and can be exponentially faster on those inputs.)
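To make the difference concrete, here is a minimal sketch in Python (chosen only as a neutral illustration; it is not .NET code and says nothing about how any real engine is written). It assumes a toy pattern language with just two token kinds, a literal `x` and an optional `x?`, and counts the steps each strategy takes on the classic pathological case `a?` repeated n times followed by `a` repeated n times, matched against n `a` characters. The names `make_pathological`, `backtrack_match`, and `stateset_match` are hypothetical helpers invented for this example.

```python
def make_pathological(n):
    """Tokens for the pattern a?{n}a{n}: n optional 'a's, then n required 'a's."""
    return [("opt", "a")] * n + [("lit", "a")] * n


def backtrack_match(tokens, text):
    """Naive recursive backtracking. Returns (matched, steps)."""
    steps = 0

    def go(ti, pos):
        nonlocal steps
        steps += 1
        if ti == len(tokens):
            return pos == len(text)  # whole pattern used: succeed only at end of text
        kind, ch = tokens[ti]
        here = pos < len(text) and text[pos] == ch
        if kind == "opt":
            # Greedy: first try consuming the character, then try skipping the token.
            return (here and go(ti + 1, pos + 1)) or go(ti + 1, pos)
        return here and go(ti + 1, pos + 1)

    return go(0, 0), steps


def stateset_match(tokens, text):
    """Automaton-style simulation: track every reachable token index at once.
    Returns (matched, steps)."""
    steps = 0

    def closure(states):
        # Add every state reachable by skipping optional tokens.
        seen, stack = set(), list(states)
        while stack:
            s = stack.pop()
            if s in seen:
                continue
            seen.add(s)
            if s < len(tokens) and tokens[s][0] == "opt":
                stack.append(s + 1)
        return seen

    current = closure({0})
    for c in text:
        advanced = set()
        for s in current:
            steps += 1
            if s < len(tokens) and tokens[s][1] == c:
                advanced.add(s + 1)
        current = closure(advanced)
    return len(tokens) in current, steps


if __name__ == "__main__":
    for n in (4, 8, 12, 16):
        tokens, text = make_pathological(n), "a" * n
        _, slow = backtrack_match(tokens, text)
        _, fast = stateset_match(tokens, text)
        print(f"n={n:2d}  backtracking steps={slow:8d}  state-set steps={fast:4d}")
```

Both functions agree on whether the text matches. The backtracking version explores every combination of "consume or skip" for the optional tokens before it finds the one that works, so its step count roughly doubles with each increment of n, while the state-set version does at most one pass over a bounded set of states per input character.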

It makes no reference to the .NET Framework, but I would like to know how .NET (C# in particular) regular expressions are implemented, and how they compare in terms of performance.

Edit:

Since the article the answerer linked mentions that .NET does backtracking, can I assume its performance will be on par with Perl and Java?

1 solution

There's an awesome write-up here. The author takes advantage of the fact that you can step into the .NET Framework code and see what it does, and explains how everything works. It's an excellent read.