简体   繁体   中英

python/pandas - substring replacement using regular expression

I have inherited an old code file that has following code. It seems the last line of the code below is removing all the open ( and close ) parentheses, and - character from the phone number field. Question : But why it is using regex='\(' in .replace(regex='\(',value='') part of that last line. Some other online examples (such as here and here ) I have seen don't seem to use regex keyword in their replacement function. What regex='\(' is doing in the replace function here?

import sqlalchemy as sq
import pandas as pd
import re

pw = dbutils.secrets.get(scope='SomeScope',key='sql')
engine = sq.create_engine('mssql+pymssql://SERVICE.Databricks.NONPUBLICETL:'+pw+'MyAzureSQL.database.windows.net:1433/TEST', isolation_level="AUTOCOMMIT")

 pandas_df = pd.read_sql('select * from SQLTable1', con=engine)

 pandas_df['MOBILE_PHONE'].replace(regex='\(',value='').replace(regex='\)',value='').replace(regex='\-',value='').str.strip()

You can use the regex keyword as a boolean to tell.replace() whether to interpret the string as regex or as the regular expression itself

The signature of the replace function has changed.

Replace your last line by:

df['MOBILE_PHONE2'] = df['MOBILE_PHONE'].replace('[()-]', '', regex=True).str.strip()
print(df)

# Output
     MOBILE_PHONE MOBILE_PHONE2
0  (425) 555-1234   425 5551234

Replace ( or ) or - by ''

Coding precision depends on experience, logic and mastery of syntax. Its like mastery of normal language. The answer you adapted achieves exactly what the code below does

df['MOBILE_PHONE2'] = df['MOBILE_PHONE'].str.replace('[^\d]','',regex=True)

Explanation

\d is regex for digits

[^] is regex for everything except

[^\d] everything except digits

So, using the pandas API, I replace everything except digits in the string with nothing

Outcome

    MOBILE_PHONE  MOBILE_PHONE2
0  (425) 555-1234    4255551234

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM