
python: insert comma delimiter between words that are stuck together in a string with regex

I have a column in a table similar to the following.

tags
Large AcreageOcean IslandTurn KeyIncome Potential
Ocean IslandSeasonalMainland Lot
Lake IslandSeasonalTurn KeyIncome Potential

I need to split the strings in the column so that they look like this:

tags
Large Acreage,Ocean Island,Turn Key,Income Potential
Ocean Island,Seasonal,Mainland Lot
Lake Island,Seasonal,Turn Key,Income Potential

I thought a regex command like re.sub(r'([a-z][A-Z])', ',', <string>) could work, but that code results in

'Large Acreag,cean Islan,urn Ke,ncome Potential'

Any advice?

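Your pattern matches the lowercase letter and the uppercase letter together and replaces both with the comma, so those two characters are lost. Use two capturing groups in the expression and two backreferences in the replacement to put them back: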

re.sub(r'([a-z])([A-Z])', r'\1,\2', <string>)
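As a quick check, here is a minimal sketch that applies the same substitution to the three sample rows from the question. The helper name `split_tags` and the `rows` list are only illustrative and are not part of the original answer.

```python
import re

# Sample rows copied from the question
rows = [
    "Large AcreageOcean IslandTurn KeyIncome Potential",
    "Ocean IslandSeasonalMainland Lot",
    "Lake IslandSeasonalTurn KeyIncome Potential",
]

def split_tags(text):
    # \1 is the lowercase letter, \2 is the uppercase letter;
    # both are written back with a comma placed between them.
    return re.sub(r'([a-z])([A-Z])', r'\1,\2', text)

for row in rows:
    print(split_tags(row))
# Large Acreage,Ocean Island,Turn Key,Income Potential
# Ocean Island,Seasonal,Mainland Lot
# Lake Island,Seasonal,Turn Key,Income Potential
```

Spaces inside a tag are left alone because a lowercase letter followed by a space does not match the pattern.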

If the column is in a pandas DataFrame:

df['tags'] = df['tags'].str.replace(r'([a-z])([A-Z])', r'\1,\2', regex=True)
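A small self-contained sketch of the pandas version, assuming the data sits in a DataFrame with a single `tags` column as in the question. The final `str.split` step and the `tag_lists` name are additions for illustration, in case a list of tags per row is more useful than one comma-separated string.

```python
import pandas as pd

# Rebuild the example column from the question
df = pd.DataFrame({
    "tags": [
        "Large AcreageOcean IslandTurn KeyIncome Potential",
        "Ocean IslandSeasonalMainland Lot",
        "Lake IslandSeasonalTurn KeyIncome Potential",
    ]
})

# regex=True is required so the pattern and the backreferences are interpreted as a regex
df["tags"] = df["tags"].str.replace(r'([a-z])([A-Z])', r'\1,\2', regex=True)

# Optional extra step (not in the original answer): one Python list of tags per row
tag_lists = df["tags"].str.split(",")

print(df["tags"].tolist())
print(tag_lists.tolist())
```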

See regex proof.

EXPLANATION

--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [a-z]                    any character of: 'a' to 'z'
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  (                        group and capture to \2:
--------------------------------------------------------------------------------
    [A-Z]                    any character of: 'A' to 'Z'
--------------------------------------------------------------------------------
  )                        end of \2
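The breakdown above shows that each group consumes one character, which is why the replacement has to write them back with \1 and \2. As a variation that is not part of the original answer, the same boundary can be matched with zero-width lookarounds, which consume nothing, so the replacement is just the comma. The sketch below contrasts the question's attempt with both working forms; the variable names are only illustrative.

```python
import re

text = "Large AcreageOcean IslandTurn KeyIncome Potential"

# The question's attempt: the matched letter pair is replaced, so both letters are lost
lossy = re.sub(r'[a-z][A-Z]', ',', text)

# The answer's approach: capture both letters and write them back around the comma
with_groups = re.sub(r'([a-z])([A-Z])', r'\1,\2', text)

# Alternative (assumption, not from the answer): lookbehind and lookahead match
# the empty position between the letters, so nothing needs to be restored
with_lookarounds = re.sub(r'(?<=[a-z])(?=[A-Z])', ',', text)

print(lossy)             # Large Acreag,cean Islan,urn Ke,ncome Potential
print(with_groups)       # Large Acreage,Ocean Island,Turn Key,Income Potential
print(with_lookarounds)  # Large Acreage,Ocean Island,Turn Key,Income Potential
```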


 