I have an xlsx file with multiple worksheets.
I read it in & separate the worksheets into dataframes:
import re
import pandas as pd
xls = pd.ExcelFile('path/to/multisheet_excelfile.xlsx')
dfs = {sheet: pd.read_excel(xls, sheet) for sheet in xls.sheet_names}
I iterate through the dataframes, and then through the columns of each one, to call apply():
for df in dfs.values():
    for col in df.columns:
        df[col] = df[col].apply(lambda name:
            # apply some function here, let's say:
            re.sub(r"[\[].*?[\]]", "", repr(name)))
Is there a better way to do this, possibly not involving a double for loop?
You can't avoid loops entirely, because pandas creates a separate DataFrame for each sheet. But you can do it with a single loop:
# sheet_name=None reads every sheet: {'sheet_name1': df1, 'sheet_name2': df2, ...}
dfs = pd.read_excel('file_path', sheet_name=None)  # type: dict
dfs = {
    # applymap works element-wise; pandas 2.1+ renames it to DataFrame.map
    sheet_name: df.applymap(lambda x: re.sub(r"[\[].*?[\]]", "", repr(x)))
    for sheet_name, df in dfs.items()
}
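For reuse, the same single-loop idea could be wrapped in small helpers. This is only a sketch: the names `BRACKETED`, `strip_bracketed` and `clean_sheets` are made up for illustration, and it uses `str()` instead of `repr()` on the assumption that you don't want quote characters added around string cells.

```python
import re

# Precompiled pattern: a '[', the shortest possible run of characters, then a ']'.
BRACKETED = re.compile(r"\[.*?\]")


def strip_bracketed(value):
    """Return str(value) with every [...] segment removed (hypothetical helper)."""
    return BRACKETED.sub("", str(value))


def clean_sheets(dfs):
    """Apply strip_bracketed to every cell of every DataFrame in a dict of sheets."""
    cleaned = {}
    for sheet_name, df in dfs.items():
        # Prefer DataFrame.map (pandas 2.1+); fall back to applymap on older versions.
        # Checking the class avoids picking up a column that happens to be named "map".
        method = "map" if hasattr(type(df), "map") else "applymap"
        cleaned[sheet_name] = getattr(df, method)(strip_bracketed)
    return cleaned
```

With the dict from `pd.read_excel(..., sheet_name=None)`, the call would be `dfs = clean_sheets(dfs)`. Note that every cell becomes a string, so empty cells come out as the text "nan", just as they would with the `repr()` version.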