openPyXL - assign value to range of cells during unmerge
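I have several Excel files, each with a few worksheets, and I am writing a script that collects data from selected worksheets (if they exist in the file) and combines it into one big sheet. In general it works: it loops over the files, and if the needed worksheet exists, it finds the range of cells that contain data and appends it to a DataFrame. What I need to do now is add the header rows (column names) to the DataFrame, but in the worksheets these are multi-row headers.
To make them look the same in the DataFrame, I need to unmerge the cells in the top header row and copy the value from the first cell to the rest of the cells in the range that was previously merged.
I am using OpenPyXL to access the Excel sheets. My function receives the worksheet as its only argument. It looks like this: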
def checkForMergedCells(sheet):
    merged = sheet.merged_cell_ranges
    for mergedCell in merged:
        mc_start, mc_stop = str(mergedCell).split(':')
        cp_value = sheet[mc_start]
        sheet.unmerge_cells(mergedCell)
        cell_range = sheet[mergedCell]
        for cell in cell_range:
            cell.value = cp_value  # fails here: each item is a tuple of cells, not a cell
The problem is that cell_range returns a tuple, so I end up with the error message:
AttributeError: 'tuple' object has no attribute 'value'
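Accessing by index will generally return a tuple of tuples, unless you are trying to get an individual cell or a single row. For programmatic access you should use iter_rows() or iter_cols(). You might also like to spend some time looking at the utils module.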
from openpyxl.utils import range_boundaries

for group in ws.merged_cell_ranges:
    min_col, min_row, max_col, max_row = range_boundaries(group)
    top_left_cell_value = ws.cell(row=min_row, column=min_col).value
    for row in ws.iter_rows(min_col=min_col, min_row=min_row, max_col=max_col, max_row=max_row):
        for cell in row:
            cell.value = top_left_cell_value
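To make the "tuple of tuples" point concrete, here is a small sketch on an in-memory workbook. The range "A1:B2" and the placeholder value "x" are arbitrary choices for illustration, not part of the original question.

```python
from openpyxl import Workbook
from openpyxl.utils import range_boundaries

wb = Workbook()
ws = wb.active

block = ws["A1:B2"]   # indexing a range: a tuple of rows, each row a tuple of cells
single = ws["A1"]     # indexing one coordinate: a single cell

# Writing through the nested tuples therefore needs two loops.
for row in block:
    for cell in row:
        cell.value = "x"

# range_boundaries() turns a range string into numeric bounds
# in the order (min_col, min_row, max_col, max_row).
bounds = range_boundaries("A1:B2")

# iter_rows() walks the same rectangle using those numeric bounds.
min_col, min_row, max_col, max_row = bounds
filled = [
    [cell.value for cell in row]
    for row in ws.iter_rows(min_col=min_col, min_row=min_row,
                            max_col=max_col, max_row=max_row)
]
```

Looping once over `block` yields whole rows, which is why `cell.value` in the question fails with the tuple error.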
None of the previous answers worked for me, so I elaborated on this one, tested it, and it works.
from openpyxl import load_workbook

wb = load_workbook('Example.xlsx')
sheets = wb.sheetnames  # e.g. ['Sheet1', 'Sheet2']
for sheet_name in sheets:
    ws = wb[sheet_name]
    # you need a separate list to iterate on (see explanation #2 below)
    mergedcells = []
    for group in ws.merged_cells.ranges:
        mergedcells.append(group)
    for group in mergedcells:
        min_col, min_row, max_col, max_row = group.bounds
        top_left_cell_value = ws.cell(row=min_row, column=min_col).value
        ws.unmerge_cells(str(group))  # you need to unmerge before writing (see explanation #1 below)
        for irow in range(min_row, max_row + 1):
            for jcol in range(min_col, max_col + 1):
                ws.cell(row=irow, column=jcol, value=top_left_cell_value)
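@Дмитро Олександрович was almost right, but I had to change a few things to fix his answer:
1. You will get the error AttributeError: 'MergedCell' object attribute 'value' is read-only, because you need to unmerge the merged cells before changing their values. (See here: https://foss.heptapod.net/openpyxl/openpyxl/-/issues/1228)
2. You cannot iterate directly over ws.merged_cells.ranges, because in Python, iterating over the "ranges" list object while changing it (for example with the unmerge_cells function or the pop function) results in only half of the objects being changed. (See here: https://foss.heptapod.net/openpyxl/openpyxl/-/issues/1085) You need to create a separate list and iterate over that.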
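The second point can be reproduced without openpyxl. The following is a minimal sketch using a plain Python list, which assumes the collection behaves like a list, as the answer describes; the range strings and variable names are made up for illustration.

```python
# Stand-in for a list of merged ranges (illustrative values only).
ranges = ["A1:B1", "C1:D1", "E1:F1", "G1:H1"]

# Removing items while iterating over the same list: after each removal the
# remaining items shift left, so the loop steps over every second item.
for rng in ranges:
    ranges.remove(rng)
leftover_when_mutating = list(ranges)

# Iterating over a snapshot instead leaves the loop unaffected by removals.
ranges = ["A1:B1", "C1:D1", "E1:F1", "G1:H1"]
for rng in list(ranges):
    ranges.remove(rng)
leftover_with_snapshot = list(ranges)
```

In the first loop two of the four entries survive, which matches the "only half" behaviour described above; with the snapshot nothing is left behind.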
The following code, from http://thequickblog.com/merge-unmerge-cells-openpyxl-in-python/, worked for me.
import openpyxl
from openpyxl.utils import range_boundaries

wbook = openpyxl.load_workbook("openpyxl_merge_unmerge.xlsx")
sheet = wbook["unmerge_sample"]
# Iterate over a copy: unmerge_cells() modifies sheet.merged_cells.ranges
for cell_group in list(sheet.merged_cells.ranges):
    min_col, min_row, max_col, max_row = range_boundaries(str(cell_group))
    top_left_cell_value = sheet.cell(row=min_row, column=min_col).value
    sheet.unmerge_cells(str(cell_group))
    for row in sheet.iter_rows(min_col=min_col, min_row=min_row, max_col=max_col, max_row=max_row):
        for cell in row:
            cell.value = top_left_cell_value
wbook.save("openpyxl_merge_unmerge.xlsx")
I was getting errors and deprecation warnings until I did this:
from openpyxl.utils import range_boundaries

for group in sheet.merged_cells.ranges:  # merged_cell_ranges deprecated
    display(range_boundaries(group._get_range_string()))  # expects a string instead of an object
    min_col, min_row, max_col, max_row = range_boundaries(group._get_range_string())
    top_left_cell_value = sheet.cell(row=min_row, column=min_col).value
    for row in sheet.iter_rows(min_col=min_col, min_row=min_row, max_col=max_col, max_row=max_row):
        for cell in row:
            cell.value = top_left_cell_value
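Regarding @Charlie Clark's accepted answer and the other answers that use the code from http://thequickblog.com/merge-unmerge-cells-openpyxl-in-python, you can unmerge the cells more easily, without dealing with range_boundaries and those conversions.
In my case, I wanted to unmerge cells like this, while also copying the border, font, and alignment information: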
                          +-------+------+
+-------+------+          | Date  | Time |
| Date  | Time |          +=======+======+
+=======+======+    ->    | Aug 6 | 1:00 |
|       | 1:00 |          +-------+------+
| Aug 6 | 3:00 |          | Aug 6 | 3:00 |
|       | 6:00 |          +-------+------+
+-------+------+          | Aug 6 | 6:00 |
                          +-------+------+
With openpyxl==3.0.9, the latest version at the time of writing, I found the following works best for me:
from copy import copy

from openpyxl import load_workbook, Workbook
from openpyxl.cell import Cell
from openpyxl.worksheet.cell_range import CellRange
from openpyxl.worksheet.worksheet import Worksheet


def unmerge_and_fill_cells(worksheet: Worksheet) -> None:
    """
    Unmerges all merged cells in the given ``worksheet`` and copies the content
    and styling of the original cell to the newly unmerged cells.

    :param worksheet: The Excel worksheet containing the merged cells.
    """
    # Must convert iterator to list to eagerly evaluate all merged cell ranges
    # before looping over them - this prevents unintended side-effects of
    # certain cell ranges from being skipped since `worksheet.unmerge_cells()`
    # is destructive.
    all_merged_cell_ranges: list[CellRange] = list(
        worksheet.merged_cells.ranges
    )

    for merged_cell_range in all_merged_cell_ranges:
        merged_cell: Cell = merged_cell_range.start_cell
        worksheet.unmerge_cells(range_string=merged_cell_range.coord)

        # Don't need to convert iterator to list here since `merged_cell_range`
        # is cached
        for row_index, col_index in merged_cell_range.cells:
            cell: Cell = worksheet.cell(row=row_index, column=col_index)
            cell.value = merged_cell.value

            # (Optional) If you want to also copy the original cell styling to
            # the newly unmerged cells, you must use shallow `copy()` since
            # cell style properties are proxy objects which are not hashable.
            #
            # See <https://openpyxl.rtfd.io/en/stable/styles.html#copying-styles>
            cell.alignment = copy(merged_cell.alignment)
            cell.border = copy(merged_cell.border)
            cell.font = copy(merged_cell.font)


# Sample usage
if __name__ == "__main__":
    workbook: Workbook = load_workbook(
        filename="workbook_with_merged_cells.xlsx"
    )
    worksheet: Worksheet = workbook["My Sheet"]

    unmerge_and_fill_cells(worksheet=worksheet)

    workbook.save(filename="workbook_with_unmerged_cells.xlsx")
Here is a shorter version, without the comments and without copying the styles:
from openpyxl.worksheet.worksheet import Worksheet


def unmerge_and_fill_cells(worksheet: Worksheet) -> None:
    for merged_cell_range in list(worksheet.merged_cells.ranges):
        # Keep a reference to the top-left cell, which holds the value
        merged_cell = merged_cell_range.start_cell
        worksheet.unmerge_cells(range_string=merged_cell_range.coord)
        for row_col_indices in merged_cell_range.cells:
            worksheet.cell(*row_col_indices).value = merged_cell.value
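To try the unmerge-and-fill idea without an existing file, one option is an in-memory workbook that mirrors the Date/Time example. This is only a sketch: the helper name `unmerge_and_fill` and the sample data are assumptions for illustration, and it copies values only, not styles.

```python
from openpyxl import Workbook


def unmerge_and_fill(ws):
    """Hypothetical helper: unmerge every range and repeat its top-left value."""
    # Take a snapshot first, because unmerge_cells() modifies the collection.
    for rng in list(ws.merged_cells.ranges):
        value = ws.cell(row=rng.min_row, column=rng.min_col).value
        ws.unmerge_cells(rng.coord)
        # rng.cells yields (row, column) pairs covering the former merged area.
        for row, col in rng.cells:
            ws.cell(row=row, column=col).value = value


wb = Workbook()
ws = wb.active
ws.append(["Date", "Time"])
ws.append(["Aug 6", "1:00"])
ws.append([None, "3:00"])
ws.append([None, "6:00"])
ws.merge_cells("A2:A4")  # "Aug 6" now spans three rows

unmerge_and_fill(ws)

rows = [list(r) for r in ws.iter_rows(values_only=True)]
remaining_merges = len(ws.merged_cells.ranges)
```

After the call, every row in `rows` carries its own "Aug 6" entry, which is the shape needed before appending the rows to a DataFrame.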