
How can I speed up my regex?

I'm writing a script to point all the URLs in my content at a new location.

var regex = /.*cloudfront.net/
var pDistro = "newDistro.cloudfront.net/"

for(var i=0;i<strings.length;i++){
    strings[i] = strings[i].replace(regex,pDistro);
}

The strings I'm running replace on average about 140 characters each. They're URLs that follow the format: https://[thing to replace].cloudfront.net/[something]/[something]/[something]

But this operation is terribly slow, taking about 4.5 seconds to process an average-sized array.

Why is this so slow? How can I make it faster?

If this question would be better suited to the Code Review Stack Exchange, or some other site, let me know and I'll move it there.

EDIT:

The data, as it appeared in the db I was pulling from, looked to be about 140 characters per string. During the pull process, some virtualization happened and appended 400 more characters onto the string, so no wonder the regex takes so long.

The 140-character-string loop takes considerably less time, as others have pointed out.

The moral of the story: "Make sure the data you have is what you expect it to be" and "If your regex is taking too long, use smaller strings and a more specific regex (i.e. no wildcard)".

Perhaps it would run a little faster like this:

https:\/\/[a-zA-Z0-9]+\.cloudfront\.net

Generally, the more restrictive your character sets are, the faster the regular expression will run.
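
As a rough sketch of how that pattern might be dropped into the loop from the question: the sample URLs and the names `specificRegex` and `rewriteAll` are made up for illustration, and the replacement string here includes the scheme because the pattern consumes `https://`.

```javascript
// Hypothetical sample data; real input would come from the database.
var strings = [
    "https://d111111abcdef8.cloudfront.net/images/2015/photo.jpg",
    "https://d222222abcdef8.cloudfront.net/video/clips/intro.mp4"
];

// The pattern from above: scheme, one alphanumeric label, then the fixed host suffix.
var specificRegex = /https:\/\/[a-zA-Z0-9]+\.cloudfront\.net/;
// Assumed replacement: new host with scheme, no trailing slash (the path keeps its own).
var pDistro = "https://newDistro.cloudfront.net";

function rewriteAll(list) {
    var out = [];
    for (var i = 0; i < list.length; i++) {
        // replace() returns the string unchanged when nothing matches
        out.push(list[i].replace(specificRegex, pDistro));
    }
    return out;
}

var rewritten = rewriteAll(strings);
console.log(rewritten[0]); // https://newDistro.cloudfront.net/images/2015/photo.jpg
```

By contrast, the original `/.*cloudfront.net/` is unanchored and begins with a greedy `.*`, so a backtracking engine first runs to the end of the string and then works backwards; on input that does not contain the target at all, it may repeat that work from every starting position.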


Thanks to @sbedulin for providing a jsperf link.

For such a simple replacement, a regex is likely not the fastest way to search and replace. For example, if you replace the search with .indexOf() and then use .slice() to do the replacement, you can speed it up 12-50x (depending upon browser).

I wasn't sure of the exact replacement logic you want to simulate, but here's a non-regex method that is a lot faster:

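var pos, str, target = "cloudfront.net/";
var pDistro = "https://newDistro.cloudfront.net/";
var results = [];
for (var i = 0; i < urls.length; i++) {
    str = urls[i];
    // Locate the old host with a plain substring search
    pos = str.indexOf(target);
    if (pos !== -1) {
        // Keep everything after "cloudfront.net/" and prepend the new prefix
        results[i] = pDistro + str.slice(pos + target.length);
    } else {
        // No match: pass the string through unchanged, as replace() would
        results[i] = str;
    }
}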

Adding in the more intelligent regex replacement suggested by others, here's a comparison. The more intelligent regex definitely helps, but it is still slower than just using .indexOf() and .slice(), and the difference is most pronounced in Firefox:

See jsperf here: http://jsperf.com/fast-replacer
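
For anyone who wants to repeat the comparison locally, a minimal timing sketch along these lines might do. The generated URLs, the function names (`withRegex`, `withIndexOf`, `time`), and the run count are assumptions for illustration, and `Date.now()` timing is coarse, so treat the numbers as rough indications only.

```javascript
// Build hypothetical test data: 10,000 URLs in the question's format.
var urls = [];
for (var n = 0; n < 10000; n++) {
    urls.push("https://d" + n + "abc.cloudfront.net/a/b/file" + n + ".jpg");
}

var NEW_PREFIX = "https://newDistro.cloudfront.net/";

// Variant 1: a specific regex (here including the trailing slash).
function withRegex(list) {
    var re = /https:\/\/[a-zA-Z0-9]+\.cloudfront\.net\//;
    return list.map(function (s) { return s.replace(re, NEW_PREFIX); });
}

// Variant 2: plain substring search plus slice.
function withIndexOf(list) {
    var target = "cloudfront.net/";
    return list.map(function (s) {
        var pos = s.indexOf(target);
        return pos === -1 ? s : NEW_PREFIX + s.slice(pos + target.length);
    });
}

// Run fn over the list `runs` times and return the elapsed milliseconds.
function time(fn, list, runs) {
    var start = Date.now();
    for (var r = 0; r < runs; r++) { fn(list); }
    return Date.now() - start;
}

console.log("regex:   " + time(withRegex, urls, 20) + " ms");
console.log("indexOf: " + time(withIndexOf, urls, 20) + " ms");
```

Both variants produce the same output on this data, so any difference in the timings comes from the search strategy rather than from the result.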

