
How to extract these substrings from a string with regex in Python?

I'm building a Python module that focuses mainly on mathematics, and I thought it would be a nice touch to add support for mathematical series. I had no issues implementing arithmetic progressions and geometric series, but I ran into a problem when trying to implement recursive series. I've come up with a solution, but it first requires extracting the elements of the series from a user-input string that represents the series. I think regex might be the best option, but it is my biggest phobia, so I'd really appreciate the help.

For example, for a string like

"a_n = a_{n-1} + a_{n-2}"

I want to have a set

{"a_n","a_{n-1}","a_{n-2}"}

It also needs to support more complicated recursive definitions, like:

"a_n*a_{n-1} = ln(a_{n-2} * a_n)*a_{n-3}"

the set will be:

{"a_n","a_{n-1}","a_{n-2}","a_{n-3}"}

Feel free to do some minor syntax changes if you think it'll make it easier for the task.

The regex is simple: a_(?:n|{n-\d})

  1. the literal prefix a_
  2. followed by
    • either n
    • or {n-\d}, that is, a literal {n-, a single digit, and a closing }

import re

ptn = re.compile(r"a_(?:n|{n-\d})")

print(set(ptn.findall("a_n = a_{n-1} + a_{n-2}")))
# {'a_{n-1}', 'a_n', 'a_{n-2}'}

print(set(ptn.findall("a_n*a_{n-1} = ln(a_{n-2} * a_n)*a_{n-3}")))
# {'a_{n-1}', 'a_{n-3}', 'a_n', 'a_{n-2}'}
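
Note that \d matches exactly one digit, so a term such as a_{n-10} would be missed by the pattern above. If you need larger offsets or series names other than a, something along these lines might work. This is only a sketch: the names TERM and extract_terms are made up for illustration, and it assumes that series names are single letters and that the index variable is always n.

```python
import re

# Assumptions (not from the question): the series name is a single letter,
# the index variable is always n, and offsets may have several digits.
TERM = re.compile(r"\b[A-Za-z]_(?:n\b|\{n-\d+\})")

def extract_terms(definition: str) -> set:
    """Return the distinct series terms that appear in a recurrence string."""
    return set(TERM.findall(definition))

print(extract_terms("a_n = a_{n-1} + a_{n-12}"))
print(extract_terms("a_n*a_{n-1} = ln(a_{n-2} * a_n)*a_{n-3}"))
```

The \b anchors keep the pattern from matching inside longer identifiers, and the braces are escaped explicitly so the pattern does not depend on how the regex engine treats a bare {.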
