How to improve regex performance in Java
I have this code to convert all the text before "=" to uppercase:
Matcher m = Pattern.compile("((?:^|\n).*?=)").matcher(conteudo);
while (m.find()) {
    conteudo = conteudo.replaceFirst(m.group(1), m.group(1).toUpperCase());
}
But when the string is very large this becomes very slow, and I want to find a faster way to do it.
Any suggestions?
EDIT
I didn't explain this properly. I have text like this:
field=value
field2=value2
field3=value3
And I want to convert each line like this:
FIELD=value
FIELD2=value2
FIELD3=value3
The fastest way to make a regex run fast is not to use one. Regex was never meant to be, and almost never is, a good choice for performance-sensitive operations. (Further reading: Why are regular expressions so controversial?)
Try using String class methods instead, or write a custom method that does what you want. Use a tokenizer that splits on '=', and then call .toUpperCase() on the trailing part of each token (whatever comes after the \n). Alternatively, just convert to char[] or use charAt() and traverse the string manually, switching characters to upper case after a newline and back to normal after '='.
For example: 例如:
public static String changeCase( String s ) {
    boolean capitalize = true;
    int len = s.length();
    char[] output = new char[len];
    for( int i = 0; i < len; i++ ) {
        char input = s.charAt(i);
        if ( input == '\n' ) {
            capitalize = true;
            output[i] = input;
        } else if ( input == '=' ) {
            capitalize = false;
            output[i] = input;
        } else {
            output[i] = capitalize ? Character.toUpperCase(input) : input;
        }
    }
    return new String(output);
}
Method input:
field=value\n
field2=value2\n
field3=value3
Method output:
FIELD=value\n
FIELD2=value2\n
FIELD3=value3
Try it here: http://ideone.com/k0p67j
PS (by Jamie Zawinski):
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
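The split-based alternative mentioned above could look roughly like the following. This is only a sketch: the class and method names (SplitUpperCase, upperCaseKeys) are made up for illustration, and it assumes lines are separated by '\n'. Unlike changeCase, it leaves a line that contains no '=' unchanged, and it uses Locale.ROOT so the result does not depend on the default locale.

```java
import java.util.Locale;

public class SplitUpperCase {
    // Hypothetical helper: upper-cases the part of each line before the first '='.
    static String upperCaseKeys(String text) {
        // A limit of -1 keeps trailing empty lines, so a final '\n' survives.
        String[] lines = text.split("\n", -1);
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                out.append('\n');
            }
            // A limit of 2 splits on the first '=' only; later '=' stay in the value.
            String[] parts = lines[i].split("=", 2);
            if (parts.length == 2) {
                out.append(parts[0].toUpperCase(Locale.ROOT)).append('=').append(parts[1]);
            } else {
                out.append(lines[i]); // no '=' on this line: copy it as-is
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(upperCaseKeys("field=value\nfield2=value2\nfield3=value3"));
    }
}
```

Building the result in a single StringBuilder avoids creating a new copy of the whole text for every line, which is what makes the replaceFirst loop in the question expensive.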
With a multiline regex we can simply match every line separately and replace it :)
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String conteudo = "field=value\nfield2=value2\nfield3=value3";
Pattern pattern = Pattern.compile("^([^=]+=)(.*)$", Pattern.MULTILINE);
Matcher matcher = pattern.matcher(conteudo);
StringBuffer result = new StringBuffer();
while (matcher.find()) {
    // quoteReplacement stops '$' or '\' in the text from being read as group references
    matcher.appendReplacement(result, Matcher.quoteReplacement(matcher.group(1).toUpperCase() + matcher.group(2)));
}
matcher.appendTail(result); // copy whatever follows the last match
System.out.println(conteudo);
System.out.println(result.toString());
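On Java 9 or later, the same single-pass idea can be written more compactly with Matcher.replaceAll(Function). The sketch below is an illustration rather than the answer's own code: the names RegexUpperCase and upperCaseKeys are hypothetical, and the pattern is narrowed to [^=\n]* so that a line without '=' cannot make the match run on into the next line.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexUpperCase {
    // Compiled once and reused; matches from the start of a line up to its first '='.
    private static final Pattern KEY = Pattern.compile("^[^=\\n]*=", Pattern.MULTILINE);

    // Hypothetical helper: one pass over the text, upper-casing each matched key.
    static String upperCaseKeys(String text) {
        return KEY.matcher(text).replaceAll(
                // quoteReplacement keeps '$' and '\' in the key from being treated specially
                m -> Matcher.quoteReplacement(m.group().toUpperCase(Locale.ROOT)));
    }

    public static void main(String[] args) {
        System.out.println(upperCaseKeys("field=value\nfield2=value2\nfield3=value3"));
    }
}
```

Only the key is captured and rewritten here; the value part of each line is copied through untouched by the matcher.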
What about something like this? indexOf should be fast enough.
int equalsIdx = conteudo.indexOf('=');
String result = conteudo.substring(0, equalsIdx).toUpperCase() + conteudo.substring(equalsIdx, conteudo.length());
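The snippet above handles a single field=value pair and throws if there is no '=' at all. For the multi-line input in the question, the same indexOf idea could be applied line by line, roughly as follows. This is a sketch under the assumption that lines are separated by '\n'; the names IndexOfUpperCase and upperCaseKeys are invented for the example.

```java
import java.util.Locale;

public class IndexOfUpperCase {
    // Hypothetical helper: walks the text line by line using indexOf only.
    static String upperCaseKeys(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int eq = text.indexOf('=');   // position of the next '=' not yet behind us
        int lineStart = 0;
        while (lineStart <= text.length()) {
            int lineEnd = text.indexOf('\n', lineStart);
            if (lineEnd < 0) {
                lineEnd = text.length();
            }
            // Search again only when the cached '=' belongs to an earlier line.
            if (eq >= 0 && eq < lineStart) {
                eq = text.indexOf('=', lineStart);
            }
            if (eq >= 0 && eq < lineEnd) {
                out.append(text.substring(lineStart, eq).toUpperCase(Locale.ROOT));
                out.append(text, eq, lineEnd);        // '=' and the value, unchanged
            } else {
                out.append(text, lineStart, lineEnd); // no '=' on this line
            }
            if (lineEnd < text.length()) {
                out.append('\n');
            }
            lineStart = lineEnd + 1;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(upperCaseKeys("field=value\nfield2=value2\nfield3=value3"));
    }
}
```

Caching the position of the next '=' keeps the scan linear even when many lines contain no '='.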
Try this pattern instead:
((?:^|\n)[^=]*=)