python: insert comma delimiter between words that are stuck together in a string with regex

Question

I have a column in a table similar to the following.

tags
Large AcreageOcean IslandTurn KeyIncome Potential
Ocean IslandSeasonalMainland Lot
Lake IslandSeasonalTurn KeyIncome Potential

I need to split the strings in the column so that it looks like this:

tags
Large Acreage,Ocean Island,Turn Key,Income Potential
Ocean Island,Seasonal,Mainland Lot
Lake Island,Seasonal,Turn Key,Income Potential

I thought a regex command like re.sub(r'([a-z][A-Z])', ',', <string>) could work, but that code results in:

'Large Acreag,cean Islan,urn Ke,ncome Potential'

Any advice?

Answer 1

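Your pattern replaces both matched letters with the comma. Use two capturing groups in the pattern and two backreferences in the replacement, so the letters are written back on either side of the comma: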

re.sub(r'([a-z])([A-Z])', r'\1,\2', <string>)
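
As a quick check, here is that substitution run against the sample rows from the question. The helper name insert_commas is made up for this illustration. The last line reruns the single-group attempt from the question to show where the letters go missing.

```python
import re

# Hypothetical helper name, used only for this illustration.
def insert_commas(text):
    # \1 is the lowercase letter that ends one tag, \2 the uppercase letter
    # that starts the next; both are written back around the comma.
    return re.sub(r'([a-z])([A-Z])', r'\1,\2', text)

rows = [
    "Large AcreageOcean IslandTurn KeyIncome Potential",
    "Ocean IslandSeasonalMainland Lot",
    "Lake IslandSeasonalTurn KeyIncome Potential",
]

fixed = [insert_commas(row) for row in rows]
for line in fixed:
    print(line)

# The attempt from the question: the whole two-letter match is replaced by ','
# so one letter on each side of every boundary is lost.
lossy = re.sub(r'([a-z][A-Z])', ',', rows[0])
print(lossy)
```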

If the column is in a pandas DataFrame:

df['tags'] = df['tags'].str.replace(r'([a-z])([A-Z])', r'\1,\2', regex=True)
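
A minimal end-to-end sketch of the pandas version, assuming the table is a DataFrame named df with a single tags column holding the sample data from the question:

```python
import pandas as pd

# Sample data copied from the question; the DataFrame layout is an assumption.
df = pd.DataFrame({
    "tags": [
        "Large AcreageOcean IslandTurn KeyIncome Potential",
        "Ocean IslandSeasonalMainland Lot",
        "Lake IslandSeasonalTurn KeyIncome Potential",
    ]
})

# regex=True is needed so the pattern and the \1,\2 backreferences are
# interpreted as a regular expression rather than as literal text.
df["tags"] = df["tags"].str.replace(r"([a-z])([A-Z])", r"\1,\2", regex=True)

result = df["tags"].tolist()
for line in result:
    print(line)
```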

EXPLANATION

--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [a-z]                    any character of: 'a' to 'z'
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  (                        group and capture to \2:
--------------------------------------------------------------------------------
    [A-Z]                    any character of: 'A' to 'Z'
--------------------------------------------------------------------------------
  )                        end of \2
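
An alternative that is not part of the answer above: the same boundary can be matched with lookarounds, which consume no characters, so no backreferences are needed. The sketch below compares both patterns on one sample row. It also shows a caveat that applies to both: any lowercase-then-uppercase pair is treated as a boundary, including one inside a single tag. The "McDonald" string is made-up test data, not from the question.

```python
import re

# Alternative pattern (not from the answer): match the empty position that has
# a lowercase letter behind it and an uppercase letter ahead of it.
BOUNDARY = re.compile(r'(?<=[a-z])(?=[A-Z])')

sample = "Ocean IslandSeasonalMainland Lot"
with_groups = re.sub(r'([a-z])([A-Z])', r'\1,\2', sample)
with_lookarounds = BOUNDARY.sub(',', sample)
print(with_lookarounds)
print(with_groups == with_lookarounds)

# Caveat, shown on a made-up string: "cD" inside "McDonald" also looks like a
# boundary, so the tag is split in the middle.
made_up = "Old McDonald FarmTurn Key"
split_inside_tag = BOUNDARY.sub(',', made_up)
print(split_inside_tag)
```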

solution1
3 ACCEPTED 2021-07-26 21:32:50