I have a column in a table similar to the following.
tags |
---|
Large AcreageOcean IslandTurn KeyIncome Potential |
Ocean IslandSeasonalMainland Lot |
Lake IslandSeasonalTurn KeyIncome Potential |
I need to split the strings in the column so that it looks like this:
tags |
---|
Large Acreage,Ocean Island,Turn Key,Income Potential |
Ocean Island,Seasonal,Mainland Lot |
Lake Island,Seasonal,Turn Key,Income Potential |
I thought a regex command like re.sub(r'([a-z][A-Z])', ',', <string>)
could work, but that code results in
'Large Acreag,cean Islan,urn Ke,ncome Potential'
Any advice?
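Your pattern matches both letters at each boundary and replaces them with the comma, which is why they disappear. Use two capturing groups in the expression and two backreferences in the replacement, so the letters are written back on either side of the comma: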
re.sub(r'([a-z])([A-Z])', r'\1,\2', <string>)
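As a quick check, here is a minimal sketch that runs the same substitution on the three sample rows from the question. The helper name `split_tags` and the `rows` list are only for illustration and are not part of the original answer.

```python
import re

def split_tags(text: str) -> str:
    """Insert a comma at every lowercase-to-uppercase boundary."""
    # \1 is the captured lowercase letter, \2 the captured uppercase letter;
    # both are written back, with a comma between them.
    return re.sub(r'([a-z])([A-Z])', r'\1,\2', text)

# Sample rows copied from the question.
rows = [
    "Large AcreageOcean IslandTurn KeyIncome Potential",
    "Ocean IslandSeasonalMainland Lot",
    "Lake IslandSeasonalTurn KeyIncome Potential",
]

for row in rows:
    print(split_tags(row))
```

A space between words (as in "Large Acreage") is left alone, because the pattern only matches a lowercase letter immediately followed by an uppercase one.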
If the column is in a pandas DataFrame:
df['tags'] = df['tags'].str.replace(r'([a-z])([A-Z])', r'\1,\2', regex=True)
EXPLANATION
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
[a-z] any character of: 'a' to 'z'
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
( group and capture to \2:
--------------------------------------------------------------------------------
[A-Z] any character of: 'A' to 'Z'
--------------------------------------------------------------------------------
) end of \2
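To make the difference concrete, the sketch below puts the pattern from the question next to the two-group version. It also shows a lookaround variant, which is not in the answer above and is offered only as an alternative: it matches the empty position between the two letters, so no backreferences are needed. The variable names are illustrative.

```python
import re

s = "Large AcreageOcean IslandTurn KeyIncome Potential"

# Pattern from the question: one group holding both letters.
# The whole match (both letters) is replaced by the comma, so they are lost.
consumed = re.sub(r'([a-z][A-Z])', ',', s)

# Two groups with backreferences: both letters are written back.
grouped = re.sub(r'([a-z])([A-Z])', r'\1,\2', s)

# Alternative (assumption, not from the answer): lookbehind and lookahead
# match a zero-width position, so nothing is consumed and a plain ','
# can be used as the replacement.
lookaround = re.sub(r'(?<=[a-z])(?=[A-Z])', ',', s)

print(consumed)
print(grouped)
print(lookaround)
```

The first line printed reproduces the broken result from the question; the second and third are identical and match the desired output.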