How can I profile Perl regexes?

What is the best way to profile Perl regexes to determine how expensive they are?

Perl comes with the Benchmark module, which can take a number of code samples and answer the question "which one is faster?". I've got a Perl Tip on Benchmarking Basics, and while that doesn't use regexps per se, it does give a quick and useful introduction to the topic, along with further references.
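A minimal sketch of what that comparison might look like for regexes. The two candidate patterns and the sample lines are hypothetical stand-ins, not taken from the tip or the book; only `cmpthese` itself comes from the Benchmark module.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample input; substitute lines from your own data.
my @lines = map { "user$_\@example.com logged in from 10.0.0.$_" } 1 .. 200;

# Two hypothetical candidate patterns that capture the same text.
my $greedy  = qr/^(.*)\@example\.com/;
my $negated = qr/^([^@]+)\@example\.com/;

# Count matches so that both candidates do comparable work.
sub count_matches {
    my ($re) = @_;
    my $hits = 0;
    for my $line (@lines) {
        $hits++ if $line =~ $re;
    }
    return $hits;
}

# A negative count asks Benchmark to run each sub for at least
# that many CPU seconds instead of a fixed number of iterations.
cmpthese(-1, {
    greedy  => sub { count_matches($greedy) },
    negated => sub { count_matches($negated) },
});
```

`cmpthese` prints a table of rates and relative percentages. The numbers depend on the machine, the Perl version, and above all the input, so treat them as a comparison for this data only.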

brian d foy also has an excellent chapter on benchmarking in his Mastering Perl book. He's been kind enough to put the chapter on-line as a draft, which is well worth the read. I really can't recommend it enough.

Paul

Just saying "use the Benchmark module" doesn't really answer the question, though. Benchmarking a regex is different from benchmarking a calculation; you need a large amount of realistic data so you can stress the regex as real data would. If most of your data will match, you want a regex that matches quickly; if most will fail, you want a regex that fails quickly. They could wind up being the same regex, but maybe not.
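One way to see the match/fail distinction is to time the same pattern against two workloads. This is only a sketch: the pattern and both data sets are invented for illustration, and real data should replace them.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Hypothetical pattern under test: a date such as 2024-01-31 at end of line.
my $re = qr/(\d{4})-(\d{2})-(\d{2})$/;

# Two hypothetical workloads: one where every line matches and
# one where every line fails after a partial match.
my (@mostly_match, @mostly_fail);
for my $n (1 .. 500) {
    push @mostly_match, sprintf("entry %d on 2024-01-%02d", $n, 1 + $n % 28);
    push @mostly_fail,  sprintf("entry %d on 2024-01-%s", $n, "x" x 20);
}

# Number of lines in the given array reference that match $re.
sub hits {
    my ($data) = @_;
    return scalar grep { $_ =~ $re } @$data;
}

# Run each workload for at least one CPU second and report the rate.
timethese(-1, {
    mostly_match => sub { hits(\@mostly_match) },
    mostly_fail  => sub { hits(\@mostly_fail) },
});
```

If the two rates differ a lot, the pattern is worth tuning for whichever case dominates in production data.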

My preferred way would be to have a large set of input data for the RE, then process that data N times (e.g., 100,000) to see how long it takes.

Then tweak the RE and try again. Keep all the old REs as comments in case you need to benchmark them again in future; who knows what wondrous optimizations may appear in Perl 7?
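The loop described above could be sketched roughly as follows. The input lines, both patterns, and the pass count are assumptions made up for this example; `Time::HiRes` is used here as one convenient timer, not because the answer names it.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical input set; in practice, load a large sample of real data.
my @input = map { sprintf("key%d=value%d; expires=never", $_, $_) } 1 .. 1_000;

# The pattern being tuned. Older attempts stay as comments for re-testing.
# my $re = qr/^(.+)=(.+);/;
my $re = qr/^([^=]+)=([^;]+);/;

my $passes  = 100;    # N: raise this until a run takes a few seconds
my $matched = 0;

# Time N full passes over the input set.
my $start = [gettimeofday];
for my $pass (1 .. $passes) {
    for my $line (@input) {
        $matched++ if $line =~ $re;
    }
}
my $elapsed = tv_interval($start);

printf "%d matches in %.3f s (%.2f us per match attempt)\n",
    $matched, $elapsed, 1e6 * $elapsed / ($passes * @input);
```

Wall-clock timing like this is noisy, so it helps to run each variant a few times and compare the best or median result.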

There may well be tools which can analyze REs to give you execution paths for specific inputs (like the analysis tools in a DBMS) but, since Perl is the language of the lazy (a commandment handed down by Larry himself), I couldn't be bothered going to find one :-).
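For what it's worth, core Perl does ship one facility along these lines: the `re` pragma's debug mode, which prints how a pattern was compiled and how the engine steps through a given input. A minimal sketch, with a hypothetical pattern and input chosen only to keep the trace short:

```perl
use strict;
use warnings;

# Hypothetical input, not from the original question.
my $input = "error: disk full";

{
    # Lexically scoped: only regexes inside this block are traced.
    # The compile and execution trace is written to STDERR.
    use re 'debug';
    if ($input =~ /^(\w+):\s+(.*)$/) {
        print "level=$1 message=$2\n";
    }
}
```

The exact trace format varies between Perl versions, so it is more useful for spotting heavy backtracking by eye than for parsing automatically.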
