Using xlwt, create a new sheet anytime xls row limit is reached

I'm currently writing a Python script that takes an arbitrary number of CSV files and creates .xls files from them. Unfortunately, some of these CSV files have more than 65,536 rows, which means they can't fit on one .xls sheet. I'd like a way to start a new sheet whenever that row limit is reached. For reference, here is the code I'm currently using:

import csv, xlwt, glob, ntpath

files = glob.glob("C:/Users/waldiesamuel/326/*.csv")
bold = xlwt.easyxf('font: bold on')

for i in files:
    org_file = open(i, 'r')
    reader = csv.reader((org_file), delimiter=",")
    workbook = xlwt.Workbook()
    sheet = workbook.add_sheet("SQL Results")

    path = ntpath.dirname(i)
    file = ntpath.basename(i)

    for rowi, row in enumerate(reader):

        for coli, value in enumerate(row):
            if coli == 0:
                sheet.write(rowi,coli,value,bold)
            else:
                sheet.write(rowi,coli,value)

    workbook.save(ntpath.join(path, file + '.xls'))

My thought is that around

for rowi, row in enumerate(reader):

I could use an if statement to check whether rowi has reached 65536, but I'm not sure how to create a new sheet variable from there.

Edit:

I found a potential solution, which failed; the answer explains why. I'm including it here as an edit so everyone can follow the thought process:

So it appears that because xlwt checks to specifically make sure you're not adding more than 65536 rows, this might not be doable. I had come up with what I thought was a clever solution, by changing my sheet variable to a dict, like so:

sheet = {1: workbook.add_sheet("SQL Results")}

then initializing two variables to serve as counters:

sheet_counter = 1
dict_counter = 2

and then using that for a conditional within the first for loop that would reset the row index and allow xlwt to continue writing to a new sheet:

if rowi == 65536:
    sheet[dict_counter] = workbook.add_sheet("SQL Results (" + str(dict_counter) + ")")
    sheet_counter += 1
    dict_counter += 1
    rowi = 1
else:
    pass

Unfortunately, even doing so still causes xlwt to throw the following error when the row variable increments beyond 65536:

Traceback (most recent call last):
  File "xlstest.py", line 35, in <module>
    sheet[sheet_counter].write(rowi,coli,value,bold)
  File "C:\Users\waldiesamuel\AppData\Local\Programs\Python\Python35-32\lib\site-packages\xlwt\Worksheet.py", line 1088, in write
    self.row(r).write(c, label, style)
  File "C:\Users\waldiesamuel\AppData\Local\Programs\Python\Python35-32\lib\site-packages\xlwt\Worksheet.py", line 1142, in row
    self.__rows[indx] = self.Row(indx, self)
  File "C:\Users\waldiesamuel\AppData\Local\Programs\Python\Python35-32\lib\site-packages\xlwt\Row.py", line 43, in __init__
    raise ValueError("row index was %r, not allowed by .xls format" % rowx)
ValueError: row index was 65537, not allowed by .xls format

xlwt is

a library for developers to use to generate spreadsheet files compatible with Microsoft Excel versions 95 to 2003 (from the xlwt project description).

In those Excel versions, a worksheet is limited to 65,536 rows.

Try XlsxWriter, which writes the Excel 2007+ .xlsx format, where a worksheet can hold up to 1,048,576 rows.
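As a rough sketch of what the XlsxWriter route could look like for one CSV file: the function name csv_to_xlsx and its parameters are made up for illustration, and the bold first column simply mirrors the question's script. XlsxWriter is a third-party package, so this assumes it is installed.

```python
import csv


def csv_to_xlsx(csv_path, xlsx_path, sheet_name="SQL Results"):
    """Copy every row of a CSV file into a single .xlsx worksheet."""
    # Third-party package (pip install XlsxWriter); imported here so the
    # dependency is only needed when the function is actually called.
    import xlsxwriter

    workbook = xlsxwriter.Workbook(xlsx_path)
    sheet = workbook.add_worksheet(sheet_name)
    bold = workbook.add_format({"bold": True})

    with open(csv_path, newline="") as src:
        for rowi, row in enumerate(csv.reader(src)):
            for coli, value in enumerate(row):
                # First column in bold, as in the question's script
                if coli == 0:
                    sheet.write(rowi, coli, value, bold)
                else:
                    sheet.write(rowi, coli, value)

    workbook.close()  # XlsxWriter writes the file when the workbook is closed
```

Note that this produces .xlsx rather than .xls, so it only helps if the newer format is acceptable to whoever consumes the files.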

The problem with your solution is that you are trying to reset rowi (which comes from your enumerate() call) back to 1, but enumerate() assigns it again on the next iteration, so the reset is lost and the index keeps climbing past 65536.
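A minimal demonstration of that behaviour, using a short list in place of the CSV reader and a reset at index 2 instead of 65536:

```python
seen = []
for rowi, row in enumerate(["a", "b", "c", "d"]):
    if rowi == 2:
        rowi = 0  # attempt to "reset" the counter
    seen.append(rowi)

# The reset only lasts for one iteration; enumerate() then carries on at 3
print(seen)  # [0, 1, 0, 3]
```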

The easiest way to achieve what you want, I think, is to change the way you reference rows and sheets. You can use the floor division and modulo operators to give you the sheet number and row numbers respectively.

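# In the outer (row) loop, with sheet = {1: ...} and dict_counter = 2 as in your edit:
if rowi > 0 and rowi % 65536 == 0:
    # Crossed a sheet boundary, so add the next sheet
    sheet[dict_counter] = workbook.add_sheet("SQL Results (" + str(dict_counter) + ")")
    sheet_counter += 1 # Not sure if you use this anywhere else - it can probably go
    dict_counter += 1

sheetno = rowi // 65536 + 1  # 1-based key, matching your sheet dict
rowno = rowi % 65536         # row index within that sheet

# In the inner (column) loop:
sheet[sheetno].write(rowno,coli,value,bold)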
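Putting the pieces together, here is one possible self-contained version of the idea. The function name write_rows and its parameters are hypothetical; it only assumes that the workbook has an add_sheet(name) method returning an object with write(row, col, value[, style]), which is the shape of the xlwt calls used in the question.

```python
def write_rows(workbook, rows, base_name="SQL Results", max_rows=65536, bold=None):
    """Write rows to workbook, starting a new sheet every max_rows rows.

    Returns the list of sheets that were created.
    """
    sheets = []
    for rowi, row in enumerate(rows):
        # Floor division and modulo in one step
        sheetno, rowno = divmod(rowi, max_rows)
        if sheetno == len(sheets):
            # First row belonging to a sheet that doesn't exist yet
            if sheetno == 0:
                name = base_name
            else:
                name = "%s (%d)" % (base_name, sheetno + 1)
            sheets.append(workbook.add_sheet(name))
        sheet = sheets[sheetno]
        for coli, value in enumerate(row):
            # First column in bold when a style is supplied
            if coli == 0 and bold is not None:
                sheet.write(rowno, coli, value, bold)
            else:
                sheet.write(rowno, coli, value)
    return sheets
```

With xlwt, this would be called along the lines of write_rows(workbook, reader, bold=bold) in place of the nested loops, followed by the existing workbook.save(...) call. Because rowi is never modified, the sheet and row positions are always derived from it rather than tracked in separate counters.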
