简体   繁体   中英

Multiple Pattern Match Algorithm

I have lot of logs and every record contains a url. And I have about 2000+ url patterns to filter the log. Some patterns are regular pattern with capturable group. I want to get url and the matched pattern and, if possible, the captured groupes. Is there a java lib can help me. Or any Algorithm which can solve my problem. Or anyting else which related to my problem. Thanks a lot.

Take a look at java regular expressions library ( link ).

You can construct a single large pattern by concatenating your original patterns with | between them (use () to specify that you don't want just 1 character).

The regular expression can be compiled into an efficient matching finite automata, that you can run over your data. Just make sure you compile it once and reuse it for every record.

It will handle extracting groups, but you need to handle the groups in a generic way (since any group can be matched). If it makes it easier consider using named groups to make handling simpler.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM