How can I write a large csv file using Python?

I need to extract a large amount of data (>1 GB) from a database into a CSV file. I'm using this script:

rs_cursor = rs_db.cursor()
rs_cursor.execute("""SELECT %(sql_fields)s
                     FROM table1""" % {"sql_fields": sql_fields})
sqlData = rs_cursor.fetchall()
rs_cursor.close()

c = csv.writer(open(filename, "wb"))
c.writerow(headers)
for row in sqlData:
    c.writerow(row)

The problem is that the system runs out of memory while the file is being written. Is there a more memory-efficient way to create a large CSV file in this case?

psycopg2 (which the OP uses) has a fetchmany method that accepts a size argument. Use it to read a fixed number of rows from the database at a time. You can experiment with the value of n to balance run time against memory usage.

fetchmany docs: http://initd.org/psycopg/docs/cursor.html#cursor.fetchmany

    import csv

    rs_cursor = rs_db.cursor()
    rs_cursor.execute("""SELECT %(sql_fields)s
                         FROM table1""" % {"sql_fields": sql_fields})

    n = 100  # rows per batch; tune for speed vs. memory

    # Python 3: text mode with newline=""; on Python 2 use open(filename, "wb")
    with open(filename, "w", newline="") as f:
        c = csv.writer(f)
        c.writerow(headers)

        sqlData = rs_cursor.fetchmany(n)
        while sqlData:
            for row in sqlData:
                c.writerow(row)
            sqlData = rs_cursor.fetchmany(n)

    rs_cursor.close()


You can also wrap this with a generator to simplify the code a little bit:

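def get_n_rows_from_table(n):
    rs_cursor = rs_db.cursor()
    rs_cursor.execute("""SELECT %(sql_fields)s
                         FROM table1""" % {"sql_fields": sql_fields})
    sqlData = rs_cursor.fetchmany(n)

    while sqlData:
        yield sqlData  # one batch: a list of up to n rows
        sqlData = rs_cursor.fetchmany(n)
    rs_cursor.close()

# Python 3: text mode with newline=""; on Python 2 use open(filename, "wb")
with open(filename, "w", newline="") as f:
    c = csv.writer(f)
    c.writerow(headers)

    for rows in get_n_rows_from_table(100):
        c.writerows(rows)  # each item is a batch, so write all of its rows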
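
If you want to try the batching idea end to end without a PostgreSQL server, here is a minimal sketch. It assumes the standard-library sqlite3 module as a stand-in for the real database (its cursors also provide fetchmany), and the names iter_batches, export_to_csv and make_demo_db are made up for this illustration, not part of psycopg2 or the OP's code.

```python
import csv
import sqlite3


def iter_batches(cursor, batch_size):
    """Yield lists of at most batch_size rows until the cursor is exhausted."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


def export_to_csv(connection, query, headers, filename, batch_size=1000):
    """Write the result of query to filename batch by batch.

    Returns the number of data rows written (the header is not counted).
    """
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        written = 0
        # newline="" lets the csv module control line endings (Python 3).
        with open(filename, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(headers)
            for rows in iter_batches(cursor, batch_size):
                writer.writerows(rows)  # only one batch is held in memory
                written += len(rows)
        return written
    finally:
        cursor.close()


def make_demo_db(row_count):
    """Hypothetical helper: build an in-memory table1 with row_count rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO table1 VALUES (?, ?)",
        ((i, "name%d" % i) for i in range(row_count)),
    )
    conn.commit()
    return conn
```

Calling export_to_csv(make_demo_db(250), "SELECT id, name FROM table1", ["id", "name"], "out.csv", batch_size=100) writes the header plus 250 rows in three batches. Keeping the cursor cleanup in a finally block means the cursor is closed even if writing the file fails partway through.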

Have you tried fetchone()?

rs_cursor = rs_db.cursor()
rs_cursor.execute("""SELECT %(sql_fields)s
                     FROM table1""" % {"sql_fields": sql_fields})

# Python 3: text mode with newline=""; on Python 2 use open(filename, "wb")
with open(filename, "w", newline="") as f:
    c = csv.writer(f)
    c.writerow(headers)
    row = rs_cursor.fetchone()
    while row:
        c.writerow(row)
        row = rs_cursor.fetchone()

rs_cursor.close()
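
A shorter variant of the same row-at-a-time idea, as a rough sketch: many DB-API cursors (psycopg2 and sqlite3 among them) can be iterated directly, and csv.writer.writerows accepts any iterable of rows, so the explicit fetchone loop can be dropped. The example assumes sqlite3 as a stand-in database, and write_rows is a made-up helper name.

```python
import csv
import io
import sqlite3


def write_rows(cursor, headers, out_file):
    """Write headers, then every remaining row of cursor, to out_file."""
    writer = csv.writer(out_file)
    writer.writerow(headers)
    # writerows() pulls rows from the cursor one at a time,
    # so no list of all rows is built on the Python side.
    writer.writerows(cursor)


# Small in-memory demo; a real export would pass an open file instead of StringIO.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (a INTEGER, b TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(1, "x"), (2, "y,z")])

cur = conn.cursor()
cur.execute("SELECT a, b FROM table1 ORDER BY a")
buf = io.StringIO()
write_rows(cur, ["a", "b"], buf)
cur.close()
```

One caveat worth checking for your driver: with psycopg2's default client-side cursor, the documentation says the whole result set is transferred to the client when the query is executed, so fetchone, fetchmany and iteration only limit how many Python row objects exist at once. If that is still too much memory, the psycopg2 docs describe server-side (named) cursors, created by passing a name to the connection's cursor() call, which fetch rows from the server in chunks.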
