Python auto-detect and split regex groups

I have a regex pattern that is too long to paste here, but you can read it at:

https://linksnappy.com/api/REGEX

I want to pass it straight to re.compile, but I get an AssertionError saying that no more than 100 named groups can be compiled.

I tried writing a pattern to split the one above, but it is very hard to make it work without raising exceptions from sre_*.py.

Is there a function, similar to sre_parse, that can automatically split the capture groups/alternatives and build a list of the regex alternatives from the pattern above?

I copied the string and compiled it in Python 3, and did not get an AssertionError. The only trick I used was to put the pattern in a triple-quoted string literal, '''regex'''.
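To make that concrete, here is a minimal sketch of compiling a pattern from a raw triple-quoted literal and checking how many capturing groups it has. The PATTERN below is a short hypothetical stand-in, not the real pattern from the link, and count_groups is just an illustrative helper name.

```python
import re

# Hypothetical stand-in for the long pattern. A raw triple-quoted literal
# keeps backslashes and single/double quotes intact when pasting.
PATTERN = r'''(?:https?://)?(?:www\.)?(example\.com/(\w+)|example\.org/file/(\d+))'''


def count_groups(pattern):
    """Compile the pattern and return its number of capturing groups."""
    return re.compile(pattern).groups


print(count_groups(PATTERN))
```

The .groups attribute of the compiled pattern is the number you need to compare against the 100-group limit.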

I also pasted it into regex101. It is valid there too, and the site gives a very detailed explanation of all the alternatives and capturing groups.

For Python 2, I saw in the source code that the number of capturing groups is limited to 100 (as far as I know, this limit was lifted in Python 3.5). In this case, Python 3 is the best option. If Python 2 is required, you may have to split or shorten the regular expression, or not use it at all.
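If you are not sure whether the interpreter you are running has the limit, a quick probe like the one below can tell you. This is only a sketch; supports_many_groups is a made-up helper name, and the exception types caught are my assumption about how older versions report the failure.

```python
import re


def supports_many_groups(n=101):
    """Return True if this interpreter can compile n capturing groups."""
    try:
        re.compile('(a)' * n)
    except (AssertionError, re.error):
        # Older interpreters reject patterns above the group limit.
        return False
    return True


print(supports_many_groups())
```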

In the particular example you provided, you can split your regex into 5 independent ones, because it consists of 127 alternatives. It has the form (a|b|c|d|e|...), but each alternative contains capturing groups as well. Check the regex explanation on regex101. Just make sure each regex has fewer than 100 capturing groups.
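I am not aware of a built-in function that does this splitting, so here is a rough sketch of how it could be done by hand. It assumes the outer wrapping parentheses have already been removed, so that the alternatives sit at the top level. It also assumes the pattern does not use verbose mode, (?#...) comments, or numbered backreferences, since group numbers change once the alternatives are regrouped. All function names are hypothetical.

```python
import re


def split_alternatives(pattern):
    """Split a pattern on '|' that is outside groups and character classes."""
    parts, current = [], []
    depth = 0          # nesting depth of parentheses
    in_class = False   # True while inside [...]
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == '\\' and i + 1 < len(pattern):
            # Keep an escaped character together with its backslash.
            current.append(pattern[i:i + 2])
            i += 2
            continue
        if in_class:
            if ch == ']':
                in_class = False
        elif ch == '[':
            in_class = True
            # A ']' directly after '[' or '[^' is a literal, not the end.
            j = i + 1
            if pattern.startswith('^', j):
                j += 1
            if pattern.startswith(']', j):
                current.append(pattern[i:j + 1])
                i = j + 1
                continue
        elif ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
        elif ch == '|' and depth == 0:
            parts.append(''.join(current))
            current = []
            i += 1
            continue
        current.append(ch)
        i += 1
    parts.append(''.join(current))
    return parts


def batch_alternatives(alternatives, max_groups=99):
    """Join alternatives into compiled regexes of at most max_groups groups."""
    batches, current, used = [], [], 0
    for alt in alternatives:
        n = re.compile(alt).groups
        if current and used + n > max_groups:
            batches.append('|'.join(current))
            current, used = [], 0
        current.append(alt)
        used += n
    if current:
        batches.append('|'.join(current))
    return [re.compile(b) for b in batches]


def match_any(compiled, text):
    """Try each compiled regex in order; return the first match or None."""
    for rx in compiled:
        m = rx.match(text)
        if m:
            return m
    return None
```

You would call split_alternatives on the unwrapped pattern, pass the result to batch_alternatives, and then use match_any wherever you previously used the single compiled regex. How many batches you end up with depends on how many groups each alternative contains.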

I hope it helps you solve the problem.

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint this content, please cite the site URL or the original address.