Say I am given a string like this:
text = "1234 I just ? shut * the door"
I want to use a regex with re.compile() so that, when I split the string, all of the words come first in the resulting list.
I.e., it should look like this:
text = ["I", "just", "shut", "the", "door", "1234", "?", "*"]
How can I use re.compile() to split the string this way?
import re
r = re.compile('regex to split string so that words are first').split(text)
Please let me know if you need any more information.
Thank you for the help.
IIUC, you don't need re. Just use str.split with sorted:
sorted(text.split(), key=lambda x: not x.isalpha())
Output:
['I', 'just', 'shut', 'the', 'door', '1234', '?', '*']
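To see why this works, here is a minimal sketch that prints the key computed for each token. The names tokens, keys and result are only illustrative. It relies on two facts: False sorts before True, and sorted is stable, so tokens with the same key keep their original relative order.

```python
text = "1234 I just ? shut * the door"
tokens = text.split()

# Key is False for purely alphabetic tokens and True for everything else.
keys = [(tok, not tok.isalpha()) for tok in tokens]
print(keys)

# False < True, so words come first; ties keep their input order (stable sort).
result = sorted(tokens, key=lambda tok: not tok.isalpha())
print(result)
```

Note that str.isalpha() is False for tokens such as "door." or "don't", so those would be grouped with the non-words.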
You can use sorted with re.findall:
import re
text = "1234 I just ? shut * the door"
r = sorted(text.split(), key=lambda x: (x.isalpha(), x.isdigit(), bool(re.findall(r'^\W+$', x))), reverse=True)
Output:
['I', 'just', 'shut', 'the', 'door', '1234', '?', '*']
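The same ordering (words, then numbers, then symbols) can also be written with an explicit rank instead of a tuple of booleans and reverse=True. This is only a sketch: rank and NON_WORD are hypothetical names, and putting mixed tokens such as "a1" last is an assumption, not something the question specifies.

```python
import re

# Compiled once; matches tokens made up only of non-word characters.
NON_WORD = re.compile(r'\W+')

def rank(token):
    """Return a sort rank: 0 = word, 1 = number, 2 = symbols, 3 = anything else."""
    if token.isalpha():
        return 0
    if token.isdigit():
        return 1
    if NON_WORD.fullmatch(token):
        return 2
    return 3  # assumption: mixed tokens such as "a1" go last

text = "1234 I just ? shut * the door"
result = sorted(text.split(), key=rank)
print(result)
```

Because sorted is stable, tokens with the same rank stay in the order in which they appear in the string.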
You can't do that with a single regex. You can write one regex to get all words, then another regex to get everything else.
import re
text = "1234 I just ? shut * the door"
r = re.compile(r'[a-zA-Z]+')
words = r.findall(text)
r = re.compile(r'[^a-zA-Z\s]+')
other = r.findall(text)
print(words + other) # ['I', 'just', 'shut', 'the', 'door', '1234', '?', '*']
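If you would rather scan the string only once, a possible variant is to combine both patterns into one compiled regex with named groups and let Python do the reordering. The regex still cannot reorder anything on its own. In this sketch TOKEN, words_first and the group names word and other are hypothetical.

```python
import re

# One alternation: letters go to the "word" group, any other non-space run to "other".
TOKEN = re.compile(r'(?P<word>[a-zA-Z]+)|(?P<other>[^a-zA-Z\s]+)')

def words_first(text):
    """Collect matches into two buckets in one pass, then put the words first."""
    buckets = {'word': [], 'other': []}
    for m in TOKEN.finditer(text):
        # lastgroup is the name of the alternative that matched.
        buckets[m.lastgroup].append(m.group())
    return buckets['word'] + buckets['other']

print(words_first("1234 I just ? shut * the door"))
```

As with the two-regex version, a token such as "don't" is broken into pieces ("don", "t" and "'"), unlike the str.split approaches, which keep whitespace-separated tokens intact.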