Openpyxl: 'ValueError: Max value is 14' when using load_workbook

I am trying to open an Excel file so that I can insert dataframes into certain sheets while leaving the other sheets alone. The script works fine on the other Excel files I tested it with. When I use it on the file I actually need, I get an error message.

Here is the script:

from openpyxl import load_workbook
book = load_workbook(self.directory)

self.directory refers to my file location. As the traceback shows, the script already fails at this line, while executing load_workbook(), with the following error message:

ValueError: Max value is 14

Here is the relevant traceback (I trimmed the paths so that they start at the virtual environment folder 'virtual'):

"""
book = load_workbook(self.directory)
virtual\lib\site-packages\openpyxl\reader\excel.py", line 217, in load_workbook
shared_strings = read_string_table(archive.read(strings_path))
virtual\lib\site-packages\openpyxl\reader\strings.py", line 22, in read_string_table
text = Text.from_tree(node).content
virtual\lib\site-packages\openpyxl\descriptors\serialisable.py", line 84, in from_tree
obj = desc.expected_type.from_tree(el)
virtual\lib\site-packages\openpyxl\descriptors\serialisable.py", line 84, in from_tree
obj = desc.expected_type.from_tree(el)
virtual\lib\site-packages\openpyxl\styles\fonts.py", line 110, in from_tree
return super(Font, cls).from_tree(node)
virtual\lib\site-packages\openpyxl\descriptors\serialisable.py", line 100, in from_tree
return cls(**attrib)
virtual\lib\site-packages\openpyxl\cell\text.py", line 114, in __init__
self.family = family
virtual\lib\site-packages\openpyxl\descriptors\nested.py", line 36, in __set__
super(Nested, self).__set__(instance, value)
virtual\lib\site-packages\openpyxl\descriptors\base.py", line 110, in __set__
super(Min, self).__set__(instance, value)
virtual\lib\site-packages\openpyxl\descriptors\base.py", line 89, in __set__
raise ValueError('Max value is {0}'.format(self.max))
ValueError: Max value is 14
"""

From this I gather that something in the Excel file I am using exceeds the limit stored in self.max.

I tried sifting through the openpyxl source myself, but I could not figure out what self.max refers to, or how I can change my Excel file so that I can load the workbook.

Can anyone point me in the right direction?

Thanks in advance!

I had to remove all formatting from the sheet I was working with.

In LibreOffice: select all, then choose "Clear Direct Formatting".

Here's what fixed this error for me. I edited lib\site-packages\openpyxl\descriptors\base.py and added a print statement to the __set__ method of class Max (after line 86), like so:

def __set__(self, instance, value):
    if ((self.allow_none and value is not None)
        or not self.allow_none):
        value = _convert(self.expected_type, value)
        if value > self.max:
            print(f"value is {value}")
            raise ValueError('Max value is {0}'.format(self.max))
    super(Max, self).__set__(instance, value)

This printed the value 34, which is higher than the max value of 14 (it's a font family value).
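
For readers wondering what self.max refers to: the traceback suggests it is an upper bound stored on a validating descriptor that guards a single attribute, here the font family. The sketch below is a simplified stand-in and not openpyxl's actual code; the names MaxBound, FakeFont and try_family are made up for illustration.

```python
class MaxBound:
    """Simplified stand-in for a validating descriptor with an upper bound."""

    def __init__(self, max):
        self.max = max

    def __set_name__(self, owner, name):
        # Remember the attribute name so the value can be stored on the instance.
        self.name = name

    def __set__(self, instance, value):
        value = int(value)
        if value > self.max:
            raise ValueError('Max value is {0}'.format(self.max))
        instance.__dict__[self.name] = value


class FakeFont:
    # Hypothetical class; the limit of 14 mirrors the error message above.
    family = MaxBound(max=14)

    def __init__(self, family):
        self.family = family


def try_family(value):
    """Return the stored family, or the error message if validation fails."""
    try:
        return FakeFont(value).family
    except ValueError as exc:
        return str(exc)
```

Under these assumptions, try_family(2) stores the value, while try_family("34") hits the same kind of ValueError as in the question.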

I then saved a copy of my Excel spreadsheet with a .zip extension, extracted all of the XML files, and used grep to search for val="34". This led me to 3 cells which somehow had a font family of 34 in them. I changed the font to something else in Excel, saved the spreadsheet, then changed it back to the original font (Arial) and saved again.
After all of this, the error was gone.
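
The unzip-and-grep step can also be done from Python without renaming the file. This is only a rough sketch: it assumes the offending value is stored as a family element with a val attribute inside the workbook's XML parts, and the function name find_large_family_values is hypothetical.

```python
import re
import zipfile

# Assumption: the font family appears as <family val="N"/> in the XML parts.
FAMILY_RE = re.compile(rb'<(?:\w+:)?family\b[^>]*\bval="(\d+)"')


def find_large_family_values(xlsx_path, limit=14):
    """Return (part_name, value) pairs where the family value exceeds limit."""
    hits = []
    # An .xlsx file is a zip archive, so it can be opened directly.
    with zipfile.ZipFile(xlsx_path) as archive:
        for name in archive.namelist():
            if not name.endswith('.xml'):
                continue
            for match in FAMILY_RE.finditer(archive.read(name)):
                value = int(match.group(1))
                if value > limit:
                    hits.append((name, value))
    return hits
```

The part names it returns (for example the shared strings or styles part) indicate where to look; the fonts themselves still have to be fixed in Excel as described above.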

I was able to resolve the error after deleting the small "comment boxes" on the worksheet from the 15th one onward.

Reducing the number of comment boxes didn't solve my problem. I had to remove worksheets until I got below 14 worksheets in total before I could open and read the document.

The Excel file was generated by WPS, not MS Office.

  1. You can use xlwings to open it.
  2. You can save it as a CSV file manually and read that.
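
For the CSV route, a minimal sketch using only the standard library might look like this. The function name read_csv_rows is hypothetical, and the utf-8-sig default is an assumption about how the file was exported (it tolerates a leading byte order mark).

```python
import csv


def read_csv_rows(csv_path, encoding='utf-8-sig'):
    """Read an exported CSV file into a list of rows (lists of strings)."""
    # newline='' lets the csv module handle line endings inside quoted fields.
    with open(csv_path, newline='', encoding=encoding) as handle:
        return list(csv.reader(handle))
```

Note that all formatting and any additional sheets are lost this way, so it only helps when the cell values are all you need.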

The issue is resolved if you suppress (comment out) the exception in openpyxl, as shown below:

def __set__(self, instance, value):
    if ((self.allow_none and value is not None)
        or not self.allow_none):
        value = _convert(self.expected_type, value)
        if value > self.max:
            # The check is disabled; pass keeps the block non-empty.
            pass
            # raise ValueError('Max value is {0}'.format(self.max))
    super(Max, self).__set__(instance, value)

That resolved the issue, and now I am able to use:

pd.read_excel(io.BytesIO(obj['Body'].read()), engine='openpyxl', sheet_name=[0], header=None)

Just comment out the line in openpyxl that raises the error.

Instead of patching the __set__ method, you can patch the max value of the specific descriptor.

# IMPORTANT, you must do this before importing openpyxl
from unittest import mock
# Set max font family value to 100
p = mock.patch('openpyxl.styles.fonts.Font.family.max', new=100)
p.start()
import openpyxl
openpyxl.open('my-bugged-worksheet.xlsx') # this works now!

If you patch descriptors\base.py you'll be allowing potentially bad values for all descriptors. This approach is more surgical in that it only patches the font family descriptor that is causing the error.
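
If you would rather not leave the patch active for the rest of the program, one possible variation is to raise the limit only while the workbook is being loaded. This is a sketch, not a tested recipe: temporary_max and load_with_relaxed_font_family are hypothetical helpers, and it assumes Font.family is reachable as a class attribute that carries max, as the mock.patch target above implies.

```python
from contextlib import contextmanager


@contextmanager
def temporary_max(descriptor, new_max):
    """Temporarily change the `max` attribute of one descriptor object."""
    old_max = descriptor.max
    descriptor.max = new_max
    try:
        yield descriptor
    finally:
        # Restore the original limit even if loading fails.
        descriptor.max = old_max


def load_with_relaxed_font_family(path, new_max=100):
    """Load a workbook while the font family limit is raised (hypothetical helper)."""
    import openpyxl
    from openpyxl.styles.fonts import Font

    # Assumption: Font.family is the descriptor object holding the limit of 14.
    with temporary_max(Font.family, new_max):
        return openpyxl.load_workbook(path)
```

The out-of-range values stay in the file, so this only gets the workbook loaded; it does not repair the fonts.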
