Is there a way to insert a space before each uppercase letter in a string (but not before the first letter)?
For example, given "RegularExpression" I'd like to obtain "Regular Expression".
I tried the following regex:
re.sub("[a-z]{1}[A-Z][a-z]{1}", " ","regularExpression")
Unfortunately, this deletes the matching pattern:
regula pression
I would prefer a regex solution, yet would be thankful for any working solution. Thanks!
In [1]: s = 'RegularExpression'
In [2]: answer = []
In [3]: breaks = [i for i,char in enumerate(s) if char.isupper()]
In [4]: breaks = breaks[1:]
In [5]: answer.append(s[:breaks[0]])
In [6]: for start,end in zip(breaks, breaks[1:]):
   ...:     answer.append(s[start:end])
   ...:
In [7]: answer.append(s[breaks[-1]:])
In [8]: answer
Out[8]: ['Regular', 'Expression']
In [9]: print(' '.join(answer))
Regular Expression
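The session above can be folded into a small function. This is only a sketch: the name split_on_uppercase is made up for illustration, and it differs slightly from the session in that it skips an uppercase letter at index 0 instead of dropping the first uppercase letter found, so it should also cope with strings that start in lowercase or contain no uppercase letters at all.

```python
def split_on_uppercase(s):
    """Split s before every uppercase letter except one at index 0.

    Hypothetical helper, not part of the original answer.
    """
    # Positions where a new word starts; index 0 is excluded on purpose.
    breaks = [i for i, char in enumerate(s) if char.isupper() and i > 0]
    # Add the start and end of the string so every slice has two bounds.
    bounds = [0] + breaks + [len(s)]
    return [s[start:end] for start, end in zip(bounds, bounds[1:])]


print(' '.join(split_on_uppercase('RegularExpression')))
```

Because the bounds list always contains 0 and len(s), there is no need to special-case the first and last pieces as the session does.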
You can do this with the following:
import re
s = "RegularExpression"
re.sub(r"([A-Z][a-z]+)([A-Z][a-z]+)", r"\1 \2", s)
which means "put a space between the first match group and the second match group", where each match group is an uppercase letter followed by one or more lowercase letters.
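One thing to watch for: re.sub replaces non-overlapping matches, so with three or more capitalised words the pattern above consumes them in pairs and can miss a boundary. The sketch below shows this and a possible variant that turns the second group into a lookahead; the variable names are illustrative only.

```python
import re

s = "MyRegularExpression"

# Two-group pattern from the answer: "MyRegular" is consumed as one match,
# so "Expression" has no preceding word left to pair with.
paired = re.sub(r"([A-Z][a-z]+)([A-Z][a-z]+)", r"\1 \2", s)

# Variant (an assumption, not the answerer's code): only assert that an
# uppercase letter follows, without consuming it.
lookahead = re.sub(r"([A-Z][a-z]+)(?=[A-Z])", r"\1 ", s)

print(paired)     # My RegularExpression
print(lookahead)  # My Regular Expression
```

For the two-word input in the question both patterns give the same result.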
Try using a lookbehind: "(?<=[a-z])([A-Z])"
Ex:
import re
s = "RegularExpression"
print(re.sub(r"(?<=[a-z])([A-Z])", r" \1", s))
Output:
Regular Expression
As I understand it, you wish to insert a space wherever an uppercase letter is preceded by a lowercase letter. You can do that by using re.sub to replace (zero-width) matches of the following regular expression with a space.
r'(?<=[a-z])(?=[A-Z])'
Note that the replacement string is a single space.
Python's regex engine performs the following operations.
(?<=[a-z]) : use a positive lookbehind to assert that the match is preceded by a lowercase letter
(?=[A-Z]) : use a positive lookahead to assert that the match is followed by an uppercase letter
For the string 'RegularExpression' the regex matches the location between the letters 'r' and 'E' (i.e., a zero-width match).
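Put together, the zero-width approach might look like the following sketch. The names BOUNDARY and add_spaces are hypothetical; the pattern is the one given above.

```python
import re

# Matches the empty position between a lowercase and an uppercase letter.
BOUNDARY = re.compile(r"(?<=[a-z])(?=[A-Z])")


def add_spaces(s):
    """Insert one space at every lowercase-to-uppercase boundary."""
    return BOUNDARY.sub(" ", s)


print(add_spaces("RegularExpression"))    # Regular Expression
print(add_spaces("myRegularExpression"))  # my Regular Expression
```

Since nothing is consumed by the match, no capture group or backreference is needed in the replacement, and a string with no lowercase-to-uppercase boundary is returned unchanged.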
IIUC, one way is to use re.findall:
re.findall("[A-Z][a-z]+", "RegularExpression")
Output:
['Regular', 'Expression']
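To get the single string asked for in the question, the list can be joined with spaces. A minimal sketch, with spaced_by_findall as a made-up name; note that anything the pattern does not match, such as a leading lowercase run or digits, is silently dropped.

```python
import re


def spaced_by_findall(s):
    """Join every 'uppercase letter + lowercase letters' chunk with spaces."""
    return " ".join(re.findall(r"[A-Z][a-z]+", s))


print(spaced_by_findall("RegularExpression"))  # Regular Expression
print(spaced_by_findall("regularExpression"))  # Expression (leading "regular" is lost)
```

This makes the findall route a good fit only when the input is known to consist entirely of capitalised words.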