I asked a similar question about a week ago, and tried to mess with that code to suit a different purpose, but couldn't seem to make it work.
I want to split a string using month abbreviations as the delimiters (so I'd have JAN, FEB, MAR, APR, MAY, JUNE, etc.).
I tried using
df['a'] = [re.split(r'[JUNE|JULY]+', x) for x in df['a']
as well as some variations on that (adding .group(0) before for x).
I'm guessing my problem is the syntax of the delimiters. Looking at the documentation for regular expressions, I should be able to use strings as delimiters, but I can only find a way to do it using re.search.
I have also tried
df['a'] = [re.split[(('JUNE', 'JULY'), x).group(0) for x in df['a']]
The data in the series is something like this:
df['a'] = ['ABCJUNE123', 'DEFJULY456', 'DEGJUNE765', 'DEFJUNE345']
and I want:
df['a'] = ['ABC', 'DEF', 'DEG', 'DEF']
What am I missing from my expression?
Your regex should be:
r'JUNE|JULY'
Example:
>>> re.split(r'JUNE|JULY', 'ABCJUNE123')
['ABC', '123']
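To apply this to every value and keep only the part before the month name, a list comprehension over the values is one option. The following is a minimal sketch that uses a plain list in place of df['a']; the names delimiters, pattern, values, and prefixes are illustrative and not from the original post.

```python
import re

# Hypothetical delimiter list; extend with the other month abbreviations as needed.
delimiters = ['JUNE', 'JULY']
# Join the words into an alternation; re.escape guards against special characters.
pattern = re.compile('|'.join(re.escape(d) for d in delimiters))

values = ['ABCJUNE123', 'DEFJULY456', 'DEGJUNE765', 'DEFJUNE345']
# maxsplit=1 splits at the first delimiter only; [0] keeps the leading part.
prefixes = [pattern.split(x, maxsplit=1)[0] for x in values]
print(prefixes)  # ['ABC', 'DEF', 'DEG', 'DEF']
```

For the DataFrame in the question, the same comprehension can be written over df['a'] and assigned back to the column.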
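The [JUNE|JULY]+ regex doesn't represent JUNE or JULY. Square brackets define a character class, so it matches one or more of the individual characters J, U, N, E, |, L, and Y in any order, and the | inside the brackets is a literal character rather than alternation.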
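The difference shows up as soon as the text before the month contains any of those letters. The following sketch compares the two patterns on a made-up value, 'LENJUNE123', which is not from the original data.

```python
import re

# Made-up value whose prefix happens to use letters from the character class.
text = 'LENJUNE123'

# Character class: any run of J, U, N, E, |, L, Y is treated as a delimiter.
by_class = re.split(r'[JUNE|JULY]+', text)
# Alternation: only the literal words JUNE or JULY are delimiters.
by_words = re.split(r'JUNE|JULY', text)

print(by_class)  # ['', '123']
print(by_words)  # ['LEN', '123']
```

With the character class, the prefix LEN is swallowed as part of the delimiter, which is why the alternation form is the safer choice here.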