I am trying to split a string into specific keywords. I have a list of keywords/characters.
For example, I have the keywords {'1', '2', '3', '4', '5', 'let', 'while'}
and the string let2while4.
I want to output a list that contains {'let', '2', 'while', '4'}.
Is this possible? Currently I only split the string on a ' ' delimiter.
Thank you!
EDIT: Gilch's answer below works for the example above, but when I use my full set of keywords, I get this error:
Traceback (most recent call last):
File "parser.py", line 14, in <module>
list = re.findall(f"({'|'.join(keywords)})", input)
File "/usr/lib/python3.7/re.py", line 223, in findall
File "/usr/lib/python3.7/sre_parse.py", line 816, in _parse
p = _parse_sub(source, state, sub_verbose, nested + 1)
File "/usr/lib/python3.7/sre_parse.py", line 426, in _parse_sub
not nested and not items))
File "/usr/lib/python3.7/sre_parse.py", line 651, in _parse
source.tell() - here + len(this))
re.error: nothing to repeat at position 17
My full set of keywords is:
keywords = {'1','2','3','4','5','6','7','8','9','0','x','y','z','+','-','*','>','(',')',';','$','let','while','else','='}
Use '|'.join() to make a regex pattern from your keywords.
>>> keywords = {'1', '2', '3', '4', '5', 'let', 'while'}
>>> string = 'let2while4'
>>> import re
>>> re.findall('|'.join(keywords), string)
['let', '2', 'while', '4']
>>> set(_)
{'let', '2', 'while', '4'}
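If your keywords might contain regex metacharacters (as '+', '*', '(', ')' and '$' in your full list do), use re.escape() on them before the join. Unescaped, a '+' or '*' that directly follows '|' has nothing to repeat, which is what raises the re.error.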
>>> re.findall('|'.join(map(re.escape, keywords)), string)
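As a rough sketch of how this could look with the full keyword set from the question: the helper name tokenize and the second test string are made up for illustration. Sorting the keywords longest-first is an extra precaution added here, not part of the answer above; it only matters if one keyword is a prefix of another.

```python
import re

# Full keyword set from the question. Several entries
# ('+', '*', '(', ')', '$') are regex metacharacters.
keywords = {'1', '2', '3', '4', '5', '6', '7', '8', '9', '0',
            'x', 'y', 'z', '+', '-', '*', '>', '(', ')', ';', '$',
            'let', 'while', 'else', '='}

# Escape every keyword so it is matched literally. Longer keywords go
# first so a short keyword cannot shadow a longer one that starts with it
# (an assumption added for safety, not needed for this particular set).
ordered = sorted(keywords, key=len, reverse=True)
pattern = re.compile('|'.join(re.escape(k) for k in ordered))


def tokenize(text):
    """Return the keywords found in text, in order of appearance.

    Characters that are not keywords (such as spaces) are skipped.
    """
    return pattern.findall(text)


print(tokenize('let2while4'))      # ['let', '2', 'while', '4']
print(tokenize('let x=(1+2)*3;'))  # ['let', 'x', '=', '(', '1', '+', '2', ')', '*', '3', ';']
```

Note that findall silently skips anything that is not a keyword, so a real parser would probably want to detect unmatched input separately.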