java.util.regex.Pattern matcher causes high CPU usage

We are having an issue with regex validation using Pattern. It happens in validation done through the Spring framework and Hibernate.

The snippet below shows the request object being validated:

@PostMapping
public ResponseEntity create(@RequestBody RequestObj request) {
  validationService.validate(request);
  .....

}

Regex pattern:

public class RequestObj {

  @Pattern(regexp = "^([a-zA-Z])+[-.'\\s]?[-a-zA-Z]*$", message = ValidationConstant.ERR_INVALID_FIRST_NAME)
  @NotNull(message = ValidationConstant.ERR_FIRST_NAME_EMPTY)
  @Size(max = 30, message = ValidationConstant.ERR_INVALID_NAME_SIZE)
  private String firstName;

  @Pattern(regexp = "^[\\sa-zA-Z0-9]+([ a-zA-Z0-9,'.?!\\-_&]+)*$", message = ValidationConstant.ERR_INVALID_COMMENT)
  @Size(max = 200, message = ValidationConstant.ERR_INVALID_COMMENT_SIZE)
  private String comment;

}

When this validation is called, the CPU usage of the thread sometimes shows 100% (it works most of the time). The thread dump shows that the thread is stuck in the Pattern class.

"http-nio-8080-exec-4" #53 daemon prio=5 os_prio=0 tid=0x00007fce45f0d000 nid=0x44 runnable [0x00007fcdb3af6000]
   java.lang.Thread.State: RUNNABLE
        at java.util.regex.Pattern$5.isSatisfiedBy(Pattern.java:5265)
        at java.util.regex.Pattern$5.isSatisfiedBy(Pattern.java:5265)
        at java.util.regex.Pattern$5.isSatisfiedBy(Pattern.java:5265)
        at java.util.regex.Pattern$CharProperty.match(Pattern.java:3790)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4274)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.match(Pattern.java:4799)
        at java.util.regex.Pattern$GroupTail.match(Pattern.java:4731)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$GroupHead.match(Pattern.java:4672)
        at java.util.regex.Pattern$Loop.matchInit(Pattern.java:4818)
        at java.util.regex.Pattern$Prolog.match(Pattern.java:4755)
        at java.util.regex.Pattern$Curly.match0(Pattern.java:4286)
        at java.util.regex.Pattern$Curly.match(Pattern.java:4248)
        at java.util.regex.Pattern$Begin.match(Pattern.java:3539)
        at java.util.regex.Matcher.match(Matcher.java:1270)
        at java.util.regex.Matcher.matches(Matcher.java:604)
        at org.hibernate.validator.internal.constraintvalidators.bv.PatternValidator.isValid(PatternValidator.java:60)
        at org.hibernate.validator.internal.constraintvalidators.bv.PatternValidator.isValid(PatternValidator.java:24)
        at org.hibernate.validator.internal.engine.constraintvalidation.ConstraintTree.validateSingleConstraint(ConstraintTree.java:171)
        at org.hibernate.validator.internal.engine.constraintvalidation.SimpleConstraintTree.validateConstraints(SimpleConstraintTree.java:68)
        at org.hibernate.validator.internal.engine.constraintvalidation.ConstraintTree.validateConstraints(ConstraintTree.java:73)
        at org.hibernate.validator.internal.metadata.core.MetaConstraint.doValidateConstraint(MetaConstraint.java:127)
        at org.hibernate.validator.internal.metadata.core.MetaConstraint.validateConstraint(MetaConstraint.java:120)
        at org.hibernate.validator.internal.engine.ValidatorImpl.validateMetaConstraint(ValidatorImpl.java:533)
        at org.hibernate.validator.internal.engine.ValidatorImpl.validateConstraintsForSingleDefaultGroupElement(ValidatorImpl.java:496)
        at org.hibernate.validator.internal.engine.ValidatorImpl.validateConstraintsForDefaultGroup(ValidatorImpl.java:465)
        at org.hibernate.validator.internal.engine.ValidatorImpl.validateConstraintsForCurrentGroup(ValidatorImpl.java:430)
        at org.hibernate.validator.internal.engine.ValidatorImpl.validateInContext(ValidatorImpl.java:380)
        at org.hibernate.validator.internal.engine.ValidatorImpl.validateCascadedAnnotatedObjectForCurrentGroup(ValidatorImpl.java:605)
        at org.hibernate.validator.internal.engine.ValidatorImpl.validateCascadedConstraints(ValidatorImpl.java:568)
        at org.hibernate.validator.internal.engine.ValidatorImpl.validateInContext(ValidatorImpl.java:389)
        at org.hibernate.validator.internal.engine.ValidatorImpl.validate(ValidatorImpl.java:169)

Is there any issue in my regex?

The regex for first name validation, which supports letters and a few special characters: ^([a-zA-Z])+[-.'\\s]?[-a-zA-Z]*$

The regex for message box text: ^[\sa-zA-Z0-9]+([ a-zA-Z0-9,'.?!\-_&]+)*$

Do you actually use the first group? ([a-zA-Z])

I don't think so, because otherwise you would have noticed that it does not hold all the letters up to the first non-letter character. With the + outside the group, the group is captured again on every repetition, so it ends up holding only the last letter matched.

You probably want to put the + sign inside the group:

^([a-zA-Z]+)[-.'\\s]?[-a-zA-Z]*$

Or do not use a group at all if you don't need that part as a group (it is probably not used in your annotation):

^[a-zA-Z]+[-.'\\s]?[-a-zA-Z]*$
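To make the difference concrete, here is a minimal sketch. The class name `CaptureGroupDemo`, the helper `firstGroup`, and the sample input are made up for illustration and are not part of the question's code. It matches the same name against both grouped variants and reads back group 1.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not taken from the question.
class CaptureGroupDemo {
    // Quantifier outside the group, as in the question:
    // the group is captured again on every repetition.
    static final Pattern PER_LETTER =
            Pattern.compile("^([a-zA-Z])+[-.'\\s]?[-a-zA-Z]*$");

    // Quantifier inside the group: the group spans the whole leading run of letters.
    static final Pattern WHOLE_RUN =
            Pattern.compile("^([a-zA-Z]+)[-.'\\s]?[-a-zA-Z]*$");

    /** Returns the text captured by group 1, or null if the input does not match. */
    static String firstGroup(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String name = "Mary-Ann"; // sample input chosen for illustration
        System.out.println(firstGroup(PER_LETTER, name)); // prints y
        System.out.println(firstGroup(WHOLE_RUN, name));  // prints Mary
    }
}
```

Both variants accept the same inputs; only the content of group 1 differs. Since a `@Pattern` constraint only needs a yes/no answer, the ungrouped form should be enough there.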

Repetition leads to backtracking, which can make a regex unexpectedly costly.

In this case the optional punctuation character between the letters may take long to resolve: because it may be absent, the matcher might have to try that empty case at any position.

Instead of

"^[a-zA-Z]+[-.'\\s]?[-a-zA-Z]*$"

try

"^[a-zA-Z]+([-.'\\s][-a-zA-Z]*)?$"

This enters the part starting with the punctuation character only when that character actually matches.
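Before swapping patterns, it may be worth checking that the rewrite accepts and rejects the same inputs. The sketch below does that for a few hand-picked names. The class name `FirstNamePatternDemo`, the helper `sameVerdict`, and the sample list are assumptions for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the two patterns are the ones quoted above.
class FirstNamePatternDemo {
    // Optional separator followed by an independent tail.
    static final Pattern ORIGINAL =
            Pattern.compile("^[a-zA-Z]+[-.'\\s]?[-a-zA-Z]*$");

    // The tail is attempted only after a separator has matched.
    static final Pattern REWRITTEN =
            Pattern.compile("^[a-zA-Z]+([-.'\\s][-a-zA-Z]*)?$");

    /** True when both patterns give the same answer for the whole input. */
    static boolean sameVerdict(String input) {
        return ORIGINAL.matcher(input).matches()
                == REWRITTEN.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Sample inputs chosen for illustration: some valid, some invalid.
        String[] samples = {"Mary", "Mary-Ann", "O'Neil", "Anne Marie",
                            "a--b", "", "-abc", "abc1", "a.b.c"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> original="
                    + ORIGINAL.matcher(s).matches()
                    + ", rewritten=" + REWRITTEN.matcher(s).matches()
                    + ", same=" + sameVerdict(s));
        }
    }
}
```

Agreement on a handful of samples is only a sanity check. It says nothing about speed, which still has to be measured.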

In general, run a microbenchmark (with a benchmark library), as the outcome might not be so clear.

However, regex will remain costly.
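The same reasoning can be applied to the comment pattern from the question, which has a quantified group whose body is itself quantified: `([...]+)*`. The repeating Loop / GroupHead / GroupTail / Curly frames in the thread dump look consistent with that shape, though that reading is an inference and is not confirmed anywhere in the thread. Since `(X+)*` accepts the same strings as `X*`, one possible rewrite drops the group. A minimal sketch, with the class name `CommentPatternDemo`, the helper `sameVerdict`, and the sample inputs all made up for illustration:

```java
import java.util.regex.Pattern;

// Hypothetical demo class; NESTED is the comment pattern from the question.
class CommentPatternDemo {
    // A run of characters can be split between the inner + and the outer *
    // in many ways, and the matcher may try them all before giving up.
    static final Pattern NESTED =
            Pattern.compile("^[\\sa-zA-Z0-9]+([ a-zA-Z0-9,'.?!\\-_&]+)*$");

    // Flattened form (an assumption, not from the thread): (X+)* becomes X*.
    static final Pattern FLAT =
            Pattern.compile("^[\\sa-zA-Z0-9]+[ a-zA-Z0-9,'.?!\\-_&]*$");

    /** True when both patterns give the same answer for the whole input. */
    static boolean sameVerdict(String input) {
        return NESTED.matcher(input).matches()
                == FLAT.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Inputs are kept short on purpose so the nested pattern stays cheap.
        String[] samples = {"Hello, world!", "Nice & easy - really?",
                            "it's_ok", "!oops", "", "bad#char", "aaaaaaaa#"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> nested="
                    + NESTED.matcher(s).matches()
                    + ", flat=" + FLAT.matcher(s).matches()
                    + ", same=" + sameVerdict(s));
        }
    }
}
```

Removing the nested quantifier takes away the many ways of splitting one run of characters between the two loops. The two remaining character classes still overlap, so some backtracking on rejected input is left. Java also supports possessive quantifiers (`++`, `*+`), which stop the matcher from giving characters back; whether that is needed here is worth measuring.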
