
Python 3.7 64 bit on Windows 7 (64 bit): CSV - field larger than field limit (131072)

The answers to _csv.Error: field larger than field limit (131072) do not fix my problem.

I have a script that processes CSV files into Excel reports. The script worked just fine until one particular CSV file became quite large (currently > 12 MB).

The script usually runs on Windows 7 64 bit, since the team uses Windows clients. Python versions range from 3.6 to 3.7.2, all 64 bit, and all of them produce the error.

The error I get in the first place is

_csv.Error: field larger than field limit (131072)

Going by the search function, that seems easy to fix. But when I include

csv.field_size_limit(sys.maxsize)

it only makes things worse:

Traceback (most recent call last):
  File "CSV-to-Excel.py", line 123, in <module>
    report = process_csv_report(infile)
  File "CSV-to-Excel.py", line 30, in process_csv_report
    csv.field_size_limit(sys.maxsize)
OverflowError: Python int too large to convert to C long

According to my research, that bug should have long since been fixed.
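
For reference, the size of the C long that the overflow complains about can be checked with the standard library alone; this is a small diagnostic sketch (not part of the original script), and the values in the comments are what a 64-bit Windows build reports:

import sys
import ctypes

print(sys.maxsize)                   # 9223372036854775807, i.e. 2**63 - 1, on 64-bit Python
print(ctypes.sizeof(ctypes.c_long))  # 4 on Windows, typically 8 on 64-bit Linux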

My current workaround is to use Linux, where the code simply works. However, the team that has to run the script can't use Linux; they are locked into Windows.

The code of the script is

#!c:\python37\python.exe

import csv
import sys


def process_csv_report(CSV_report_file):
    # the report is split across three part files sharing a base name
    files = [CSV_report_file + "_low.csv",
             CSV_report_file + "_med.csv",
             CSV_report_file + "_high.csv"]
    # (the attempted fix, csv.field_size_limit(sys.maxsize), went here
    # and raised the OverflowError shown above)
    try:
        report = []
        for i, f in enumerate(files):
            with open(f, "r", newline='', encoding='utf-8') as csvfile:
                original = csv.reader(csvfile, delimiter=',', quotechar='"')
                if i > 0:
                    # for the second and third file skip the header line
                    next(original, None)
                for row in original:
                    report.append(row)
    except Exception as e:
        print("File I/O error! File: {}; Error: {}".format(f, str(e)))
        sys.exit(1)
    return report


if __name__ == "__main__":
    # infile is set elsewhere in the full script (the traceback references
    # line 123); taken from the command line here so the snippet runs
    infile = sys.argv[1]
    report = process_csv_report(infile)
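
(For illustration: invoked as python CSV-to-Excel.py report, the script above would read report_low.csv, report_med.csv and report_high.csv; the base name report is a hypothetical example.)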

As simple as it seems, I am lost as to how to fix this, since the solution that works for others fails here for no reason I can see.

Has anybody seen this happen lately with a recent version of Python?

You could replace sys.maxsize with the C integer maximum, which is 2147483647.

I know sys.maxsize should take care of it, but using a value below that ceiling, like 1,000,000, should resolve your issue.

A nicer way to do it would be min(sys.maxsize, 2147483647).

The _csv library is a compiled C extension, so it uses C data types: the field size limit is stored in a C long, which is only 32 bits on Windows even under 64-bit Python, so sys.maxsize (2**63 - 1) does not fit.
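
A minimal sketch of both variants (the decreasing loop is a common portable workaround, not something the answer above spells out; the variable name max_int is illustrative):

import csv
import sys

# Variant 1: cap the limit at the 32-bit C long maximum, as suggested above.
csv.field_size_limit(min(sys.maxsize, 2147483647))

# Variant 2: start at sys.maxsize and shrink until the C extension accepts it.
# This also works on platforms where the C long is wider than 32 bits.
max_int = sys.maxsize
while True:
    try:
        csv.field_size_limit(max_int)
        break
    except OverflowError:
        max_int = int(max_int / 10)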
