OpenPyXL - Read Only: How to skip empty rows without knowing when they occur?

Question

I'm pretty new to programming, so please bear with me if my code is not nice and the answer is too obvious. :)

I want to parse an Excel file into a dictionary so I can later access the values via key. I won't know how the Excel file will be structured before parsing it, so I can't just hard-code which empty rows to skip, since they will occur at random. For this, I am using Python 3 and OpenPyXL (read-only mode). This is my code:

from openpyxl import load_workbook
import pprint


# path to file
c = "test.xlsx"
wb = load_workbook(filename=c, read_only=True, data_only=True)

# dictionary that holds the parsed data
data = {}
# list of worksheet names
wsname = []
# values in rows per worksheet
valuename = []


# the counters start at these odd values because pprint orders the keys strangely when 1s and 10s are mixed
# counter for row
k = 9
# counter for column
i = 10

# splits name of xlsx - file from .xlsx
workbook = c.split(".")[0]

data[workbook] = {}
for ws in wb.worksheets:
    # takes worksheet name and parses it into the wsname list
    wsname.append(ws.title)
    wsrealname = wsname.pop()
    worksheet = wsrealname
    data[workbook][worksheet] = {}
    for row in ws.rows:
        k += 1
        for cell in row:
            # reads value per row and column
            data[workbook][worksheet]["Row: " + str(k) + " Column: " + str(i)] = cell.value
            i += 1
        i = 10
    k = 9

pprint.pprint(data)

With this I get output like this:

    {'test': {'Worksheet1': {'Row: 10 Column: 10': None,
                             'Row: 10 Column: 11': None,
                             'Row: 10 Column: 12': None,
                             'Row: 10 Column: 13': None,
                             'Row: 11 Column: 10': None,
                             'Row: 11 Column: 11': 'Test1',
                             'Row: 11 Column: 12': None,
                             'Row: 11 Column: 13': None}}}

This is the output I want, except that in this example I want to skip the whole of Row 10, since all of its values are None and therefore empty.

As mentioned, I don't know when empty rows will occur, so I can't just hard-code a certain row to be skipped. In read-only mode, if you print(row), an empty row contains only 'EmptyCell' objects, like this:

(<EmptyCell>, <EmptyCell>, <EmptyCell>, <EmptyCell>)

I tried to let my program check with set() whether all the "values" in the row are duplicates:

if len(set(row)) == 1:
.....

But that doesn't solve the issue, since I get this error message:

TypeError: unhashable type: 'ReadOnlyCell'

If I compare cell.value with None and exclude every None, I get this output:

{'test': {'Worksheet1': {'Row: 11 Column: 11': 'Test1'}}}

That is not what I need, since I only want to skip cells when the whole row is empty. The output should look like this:

{'test': {'Worksheet1': {'Row: 11 Column: 10': None,
                         'Row: 11 Column: 11': 'Test1',
                         'Row: 11 Column: 12': None,
                         'Row: 11 Column: 13': None}}}

So, could you please help me figure out how to skip cells only if the complete row (and therefore all of its cells) is empty?

Thanks a lot!

Answer 1

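from openpyxl.cell.read_only import EmptyCell

for row in ws:
    # a row is empty when every cell is an EmptyCell
    # (alternatively, check whether every cell.value is None)
    empty = all(isinstance(cell, EmptyCell) for cell in row)
    if empty:
        continue  # skip rows that contain no data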

NB: in read-only mode, avoid multiple calls like data[workbook][worksheet]['A1'], as they will force the library to parse the worksheet again and again.
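
One way to combine the emptiness check with the single-pass advice above is sketched below. It is only an illustration under a few assumptions: the helper names rows_to_dict and parse_workbook are hypothetical, rows are read with iter_rows(values_only=True) (available in openpyxl 2.6 and later), iteration is assumed to start at row 1, and rows and columns are numbered from 1 rather than with the question's offset counters.

```python
def rows_to_dict(rows):
    """Map 'Row: r Column: c' keys to cell values.

    `rows` is any iterable of value tuples. Rows whose values are all
    None are skipped entirely; other rows keep their None cells.
    (Hypothetical helper, not part of openpyxl.)
    """
    result = {}
    for r, row in enumerate(rows, start=1):
        if all(value is None for value in row):
            continue  # the whole row is empty: skip it
        for c, value in enumerate(row, start=1):
            result[f"Row: {r} Column: {c}"] = value
    return result


def parse_workbook(path):
    """Read every worksheet once and return {sheet title: row dict}.

    (Hypothetical helper; assumes openpyxl >= 2.6 for values_only.)
    """
    # Imported here so rows_to_dict can be used without openpyxl installed.
    from openpyxl import load_workbook

    wb = load_workbook(filename=path, read_only=True, data_only=True)
    try:
        return {
            ws.title: rows_to_dict(ws.iter_rows(values_only=True))
            for ws in wb.worksheets
        }
    finally:
        wb.close()  # read-only workbooks keep the file open until closed
```

Because each worksheet is iterated exactly once, nothing is looked up by coordinate in read-only mode.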

Answer 2

Just create a custom generator that yields only non-empty rows:

def iter_rows_with_data(worksheet):
    for row in worksheet.iter_rows(values_only=True):
        if any(row):
            yield row
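
Note that any(row) also treats falsy values such as 0, an empty string or False as empty, so a row holding only zeros would be dropped. If that matters, a variant that tests for None explicitly might look like this (iter_non_empty_rows is a hypothetical name, and values_only requires openpyxl 2.6 or later):

```python
def iter_non_empty_rows(worksheet):
    """Yield value tuples for rows with at least one cell that is not None.

    Unlike any(row), this keeps rows whose only values are falsy,
    such as 0 or an empty string.
    """
    for row in worksheet.iter_rows(values_only=True):
        if any(value is not None for value in row):
            yield row
```

It is used the same way as the generator above, for example: for row in iter_non_empty_rows(ws).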

2 answers

Answer 1
0 2018-07-10 08:20:45

Answer 2
0 2022-08-12 14:22:47
