
Python openpyxl: max_row counts all rows instead of only the non-empty rows

I am using an Excel sheet that has 999 rows in total, of which 20 contain data and the rest are empty.

When I print max_row, it gives me 999 instead of 20. I am following this tutorial: openpyxl tutorial

import openpyxl

wb = openpyxl.load_workbook(path)  # path: location of the .xlsx file
s = wb.active
print(s.max_row)

You'll need to count them yourself if you want to use openpyxl:

import openpyxl

wb = openpyxl.load_workbook(path)  # path: location of the .xlsx file
ws = wb.active
count = 0
for row in ws:
    # Count the row unless every cell in it is empty
    if not all(cell.value is None for cell in row):
        count += 1

print(count)

Or, as a one-liner:

wb = openpyxl.load_workbook(path)
ws = wb.active
print(len([row for row in ws if not all([cell.value is None for cell in row])]))

Explanation

An empty cell in an .xlsx file has the value None when you read it. The expression all(cell.value is None for cell in row) is true only when a row has no data at all, so not all(...) counts every row that has at least one filled cell. To count only the rows in which every cell is filled, swap all for any.
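To make the difference between the all and any variants concrete, here is a minimal sketch that works on plain tuples of cell values rather than on a real worksheet. The function name count_rows and the parameter require_all_filled are made up for this illustration; with openpyxl you could pass ws.iter_rows(values_only=True) as the rows argument.

```python
def count_rows(rows, require_all_filled=False):
    """Count rows in an iterable of value tuples.

    rows is assumed to look like ws.iter_rows(values_only=True):
    one tuple of cell values per row, with None for empty cells.
    """
    count = 0
    for row in rows:
        empties = [value is None for value in row]
        if require_all_filled:
            # "any" variant: skip the row if any cell is empty
            keep = not any(empties)
        else:
            # "all" variant: skip the row only if every cell is empty
            keep = not all(empties)
        if keep:
            count += 1
    return count


# Hypothetical sample data standing in for a worksheet
sample_rows = [
    ("a", 1, None),      # partly filled
    (None, None, None),  # completely empty
    ("b", 2, 3),         # completely filled
]

partly_filled = count_rows(sample_rows)
fully_filled = count_rows(sample_rows, require_all_filled=True)
print(partly_filled, fully_filled)
```

The default counts rows with at least one value; require_all_filled=True counts only rows with no empty cell.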

That is expected. As per the docs, max_row returns:

The maximum row index containing data (1-based)

i.e. the highest row index, not the number of rows with data. If you have data only on row 100, for example, you will get 100, not 1.
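The distinction between "highest row index" and "number of rows with data" can be sketched without a workbook. The code below models rows as plain tuples (None for an empty cell) and is not openpyxl's own implementation; last_data_row and count_data_rows are hypothetical helper names. With openpyxl, the rows could come from ws.iter_rows(values_only=True).

```python
def last_data_row(rows):
    """Return the 1-based index of the last row holding any value, or 0."""
    last = 0
    for index, row in enumerate(rows, start=1):
        if any(value is not None for value in row):
            last = index
    return last


def count_data_rows(rows):
    """Return how many rows hold at least one value."""
    return sum(1 for row in rows if any(value is not None for value in row))


# Hypothetical sheet: 99 empty rows, then one value on row 100
rows = [(None, None)] * 99 + [("x", None)]

highest_index = last_data_row(rows)
data_rows = count_data_rows(rows)
print(highest_index, data_rows)
```

Here the highest index is 100 while only one row holds data, which is the same gap the question ran into.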

I found that I had to use

if not all((cell.value is None or cell.value == '') for cell in row):

to avoid counting blank cells that contain only formatting. Otherwise, for .xlsx files, I would get a count of around 1048535.
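The None-or-empty-string check can be pulled into a small helper so the counting loop stays readable. This is only a sketch over plain value tuples: is_blank, count_non_blank_rows and the strip_whitespace option are names invented here, and treating whitespace-only strings as blank is an extra assumption that the answer above does not make.

```python
def is_blank(value, strip_whitespace=False):
    """Treat None and '' as blank; optionally whitespace-only strings too."""
    if value is None:
        return True
    if isinstance(value, str):
        text = value.strip() if strip_whitespace else value
        return text == ""
    # Numbers (including 0), dates, booleans etc. count as data
    return False


def count_non_blank_rows(rows, strip_whitespace=False):
    """Count rows that contain at least one non-blank value."""
    return sum(
        1
        for row in rows
        if not all(is_blank(value, strip_whitespace) for value in row)
    )


# Hypothetical sample data; with openpyxl, rows could be
# ws.iter_rows(values_only=True)
sample_rows = [
    ("a", ""),     # has data
    ("", None),    # blank
    ("  ", None),  # whitespace only
    (0, None),     # zero is data, not blank
]

strict = count_non_blank_rows(sample_rows)
lenient = count_non_blank_rows(sample_rows, strip_whitespace=True)
print(strict, lenient)
```

Checking for None and '' explicitly, rather than testing truthiness, keeps a cell containing 0 from being mistaken for an empty one.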

