
Incremental Pattern (RegEx) matching in Java?

Is there a way or an efficient library that allows for incremental regular expression matching in Java?

What I mean by that is, I would like to have an OutputStream that I can send a couple bytes at a time to and that keeps track of matching the data so far against a regular expression. If a byte is received that will cause this regular expression to definitely not match, I would like the stream to tell me so. Otherwise it should keep me informed about the current best match, if any.

I realize that this is likely an extremely difficult and ill-defined problem, since one can imagine regular expressions that match the whole input or any part of it, or that cannot be decided until the stream is closed. Even something as trivial as .* can match H, He, Hel, Hell, Hello, and so forth. In such a case, I would like the stream to say: "Yes, this expression would match if the input ended now, and here are the groups it would return."

But if Pattern internally steps through the string it matches character by character, it might not be so hard?

Incremental matching can be achieved nicely by computing the finite state automaton corresponding to a regular expression and performing state transitions on it while processing the characters of the input. Most lexers work this way. This approach won't work well for groups, though, because a plain automaton only tracks which state it is in, not where each group started and ended.
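To make the automaton idea concrete, here is a minimal sketch of stepping a DFA one character at a time. It is only an illustration: the class name IncrementalDfa, the Status enum, and the hand-built automaton for ab*c are all made up for this example, and a real solution would generate the transition table from the regular expression instead of writing it by hand.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration: a tiny DFA that is advanced one character at a time.
final class IncrementalDfa {
    enum Status { MATCHED, POSSIBLE, FAILED }

    private static final int DEAD = -1; // sink state: no match is possible anymore

    private final Map<Integer, Map<Character, Integer>> transitions = new HashMap<>();
    private final Set<Integer> accepting = new HashSet<>();
    private int state = 0; // state 0 is the start state by convention

    void addTransition(int from, char c, int to) {
        transitions.computeIfAbsent(from, k -> new HashMap<>()).put(c, to);
    }

    void addAccepting(int s) {
        accepting.add(s);
    }

    // Consume one character and report where the match stands now.
    Status feed(char c) {
        if (state != DEAD) {
            Map<Character, Integer> row = transitions.get(state);
            Integer next = (row == null) ? null : row.get(c);
            state = (next == null) ? DEAD : next; // a missing transition means failure
        }
        return status();
    }

    Status status() {
        if (state == DEAD) {
            return Status.FAILED;
        }
        return accepting.contains(state) ? Status.MATCHED : Status.POSSIBLE;
    }

    // Hand-built automaton for the regular expression ab*c (an assumption for the demo).
    static IncrementalDfa forAbStarC() {
        IncrementalDfa dfa = new IncrementalDfa();
        dfa.addTransition(0, 'a', 1);
        dfa.addTransition(1, 'b', 1);
        dfa.addTransition(1, 'c', 2);
        dfa.addAccepting(2);
        return dfa;
    }

    public static void main(String[] args) {
        IncrementalDfa dfa = forAbStarC();
        for (char c : "abbc".toCharArray()) {
            System.out.println(c + " -> " + dfa.feed(c));
        }
    }
}
```

Each call to feed costs one table lookup, so the answer after every input character is cheap. Note that the sketch reports nothing about groups, which is exactly the limitation mentioned above.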

So perhaps you could split this into two parts: have one matcher that figures out whether there is a match so far, or any chance of a match in the future. You can use that to give a quick reply after every input character. Once you have a complete match, you can run a backtracking, grouping regular expression engine to identify your matching groups. In some cases it might be feasible to encode the group tracking into the automaton as well, but I can't think of a generic way to accomplish this.
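As a rough sketch of the two-part idea, and not the automaton described above, the standard java.util.regex classes can stand in for both parts: Matcher.matches() yields the groups once the input matches, and Matcher.hitEnd() reports whether the engine ran into the end of the input, which means more input might still change the result. The class name IncrementalRegex and its Status enum are made up for this example. It re-scans the whole buffer on every character, so it is quadratic in the input length and only suitable for short inputs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration: incremental matching on top of java.util.regex.
final class IncrementalRegex {
    enum Status { MATCHED, POSSIBLE, FAILED }

    private final Pattern pattern;
    private final StringBuilder buffer = new StringBuilder();
    private Matcher lastMatch; // non-null only while the buffer matches completely
    private boolean failed;    // once set, no further input can produce a match

    IncrementalRegex(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    // Append one character and re-check the whole buffer against the pattern.
    Status feed(char c) {
        lastMatch = null;
        if (failed) {
            return Status.FAILED;
        }
        buffer.append(c);
        Matcher m = pattern.matcher(buffer);
        if (m.matches()) {
            lastMatch = m; // groups are available until the next feed
            return Status.MATCHED;
        }
        if (m.hitEnd()) {
            return Status.POSSIBLE; // the engine ran out of input: more data may help
        }
        failed = true; // rejected before the end of input: more data cannot help
        return Status.FAILED;
    }

    // Group of the current complete match; only valid right after MATCHED.
    String group(int index) {
        if (lastMatch == null) {
            throw new IllegalStateException("no complete match at the moment");
        }
        return lastMatch.group(index);
    }

    public static void main(String[] args) {
        IncrementalRegex matcher = new IncrementalRegex("(\\d+)-(\\d+)");
        for (char c : "12-7".toCharArray()) {
            System.out.println(c + " -> " + matcher.feed(c));
        }
        System.out.println("groups: " + matcher.group(1) + ", " + matcher.group(2));
    }
}
```

A MATCHED result is not final: for a pattern like .* every further character keeps matching, so the caller decides when the input is over. To avoid the repeated scans, the status check could be replaced by an automaton, with the regex engine run only once a complete match is reported.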

