I am merging several Excel files into a CSV file. Values in one of the columns (Length) in the source files contain a leading single quote (e.g. '200, '50). Some values also contain a period (e.g. '200., '50., '10.3). I want to remove only the single quote from the values.
Input
Length
=======
'2000
'100.
'10.3
Desired output
Length
=======
2000
100.
10.3
I am using the following code, but somehow it also removes the period (.) from the values. Please help.
import pandas as pd
import glob
path= input("Enter the location of files ")
GLB_DM_VER = input("Enter global DM version")
GLB_DM_ENV = input("Enter the global DM version environment")
file_list = glob.glob(path + r"\*.xls")
excels = [pd.ExcelFile(name) for name in file_list]
frames = [x.parse(x.sheet_names[2], header=0,index_col=None) for x in excels]
combined = pd.concat(frames)
combined['LENGTH'].replace(regex=True,inplace=True,to_replace=r'\'',value=r'')
combined.to_csv("STAND_2.csv", header=['Global_DM_VERSION_ID','Global_DM_VERSION_ENV','TARGET_DOMAIN','SOURCE_DOMAIN','DOMAIN_LABEL','SOURCE_VARIABLE','RAVE_LABEL','TYPE','VARIABLE_LENGTH','CONTROL_TYPE','CODELIST_OID','TARGET_VARIABLE','MANDATORY','RAVE_ORIGIN'], index=False)
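You can try:
combined['LENGTH'] = combined['LENGTH'].str.replace("'", "")
This removes every single quote in the column and leaves the periods untouched. Note that str.replace returns a new Series, so the result has to be assigned back to the column.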
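As a quick check of the approach above, here is a minimal sketch on made-up data. The three sample values are copied from the question; the one-column DataFrame is only an assumption standing in for the real merged frame, and regex=False is passed so the quote is treated as a literal character.

```python
import pandas as pd

# Hypothetical stand-in for the merged frame; values taken from the question.
combined = pd.DataFrame({"LENGTH": ["'2000", "'100.", "'10.3"]})

# Remove only the single quote; regex=False treats "'" as a literal string.
# str.replace returns a new Series, so assign it back to the column.
combined["LENGTH"] = combined["LENGTH"].str.replace("'", "", regex=False)

result = combined["LENGTH"].tolist()
print(result)  # ['2000', '100.', '10.3']
```

The values stay strings here, so the trailing period in 100. is kept. If the column is later converted to a numeric type, the trailing period would no longer be preserved as written.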