Text to Columns in Excel using Python with openpyxl
I am trying to replicate Excel's "text-to-columns" feature in Python using openpyxl. The file I have is currently saved as .xlsx. I cannot use split() because my data is numbers, not words. I have tried pandas, but it does not work: I ran into the problem of having to install xlrd, which I cannot do because I am on an older version of Python due to an SDK I need. Is there a way to do text-to-columns in Python using openpyxl when the data is numbers? I am using Python 2.7.18.
I would like to open the file from my directory, grab the column that holds all the data (not by name but by cell, e.g. column A), split the numeric data on semicolons, then save the file.
Here is the data file: Excel Data
Here is the code I have:
text-to-column Code picture
text-to-column Code doc
Thank you!
I am not entirely sure what you are trying to do. But, based on the code and the title, my understanding is that you want to read ALL files in a particular directory (every file in that folder should be an Excel file with a single column of ;-separated data in column A), convert the text to columns, and then write the result back to the same file.
So, the code below will do this:
import openpyxl
import os
import sys
import pandas as pd

# Main
if __name__ == "__main__":
    # This is the FOLDER where all your excel files are...
    # Two back slashes as backslash is escape character
    filePath = 'C:\\Users\\potomis1\\PycharmProjects\\MUSE2022\\'
    # Go inside the folder
    os.chdir(filePath)
    # Get the list of Excel files inside the folder
    excelFiles = os.listdir('.')
    # For each Excel file
    for i in range(0, len(excelFiles)):
        df = pd.read_excel(excelFiles[i], header=None)
        # Code to separate data
        df = df[0].str.split(';', expand=True)
        # Save and close the workbook
        df.to_excel(excelFiles[i], header=None, index=False)
        print(excelFiles[i] + ' sorted.')
    # Code finishes, close the program - NOT REQUIRED
    # sys.exit()
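The loop above hands every entry returned by os.listdir to read_excel, so it relies on the folder containing nothing but Excel files. A minimal sketch of filtering the listing first, assuming the workbooks all end in .xlsx; the helper name list_xlsx_files is hypothetical and not part of the original answer:

```python
import os


def list_xlsx_files(folder):
    """Return the sorted names of the .xlsx files in folder (hypothetical helper)."""
    names = []
    for name in os.listdir(folder):
        # Skip the temporary lock files Excel creates, which start with '~$'
        if name.startswith('~$'):
            continue
        # Keep only regular files with an .xlsx extension (case-insensitive)
        if name.lower().endswith('.xlsx') and os.path.isfile(os.path.join(folder, name)):
            names.append(name)
    return sorted(names)
```

With this in place, excelFiles = list_xlsx_files('.') could stand in for the os.listdir('.') call.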
Update using openpyxl instead of read_excel
import openpyxl
from openpyxl.utils.dataframe import dataframe_to_rows
import os
import sys
import pandas as pd

# Main
if __name__ == "__main__":
    # Two back slashes as backslash is escape character
    filePath = 'C:\\Users\\potomis1\\PycharmProjects\\MUSE2022\\'
    # Go inside the folder
    os.chdir(filePath)
    # Get the list of Excel files inside the folder
    excelFiles = os.listdir('.')
    print('ExcelFiles ', excelFiles)
    # For each Excel file
    for i in range(0, len(excelFiles)):
        # Code to separate data
        Work_Book = openpyxl.load_workbook(filename=excelFiles[i])
        Work_Sheet = Work_Book.active
        df = pd.DataFrame(list(Work_Sheet.values))
        df = df[0].str.split(';', expand=True)
        # Save and close the workbook
        rows = dataframe_to_rows(df, index=False, header=None)
        for r_idx, row in enumerate(rows, 1):
            for c_idx, value in enumerate(row, 1):
                Work_Sheet.cell(row=r_idx, column=c_idx, value=value)
        Work_Book.save(filename=excelFiles[i])
        print(excelFiles[i] + ' sorted.')
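Since the question asked for openpyxl specifically, here is a rough sketch that drops pandas altogether. It assumes the data sits in column A of the active sheet and that numeric pieces should be written back as numbers rather than text. The names split_cell and text_to_columns are hypothetical, and this has not been run against the asker's file.

```python
def split_cell(value, delimiter=';'):
    """Split one cell value on delimiter, converting numeric pieces to numbers."""
    if value is None:
        return []
    parts = []
    for piece in str(value).split(delimiter):
        piece = piece.strip()
        try:
            # Prefer int so '3' is stored as 3 rather than 3.0
            parts.append(int(piece))
        except ValueError:
            try:
                parts.append(float(piece))
            except ValueError:
                # Leave non-numeric text (including empty pieces) unchanged
                parts.append(piece)
    return parts


def text_to_columns(path, column=1, delimiter=';'):
    """Split `column` of the active sheet in place, then save the workbook."""
    import openpyxl  # imported here so split_cell can be used without openpyxl

    work_book = openpyxl.load_workbook(filename=path)
    work_sheet = work_book.active
    for row_idx in range(1, work_sheet.max_row + 1):
        value = work_sheet.cell(row=row_idx, column=column).value
        # The first piece overwrites the original cell, the rest go to the right
        for offset, part in enumerate(split_cell(value, delimiter)):
            work_sheet.cell(row=row_idx, column=column + offset, value=part)
    work_book.save(filename=path)
```

Calling text_to_columns('data.xlsx') (a placeholder file name) would split column A of that workbook. Note that anything already stored to the right of the split column gets overwritten.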