
Fetching huge data from Oracle in Python

I need to fetch a huge amount of data from Oracle (using cx_Oracle) in Python 2.6 and produce a CSV file.

The data size is about 400k records x 200 columns x 100 chars each.

Which is the best way to do that?

Now, using the following code...

ctemp = connection.cursor()
ctemp.execute(sql)
ctemp.arraysize = 256
for row in ctemp:
  file.write(row[1])
  ...

... the script stays in the loop for hours and nothing is written to the file... (is there a way to print a message for every record extracted?)

Note: I don't have any issue with Oracle, and running the query in SqlDeveloper is super fast.

Thank you, gian

You should use cur.fetchmany() instead. It will fetch a chunk of rows whose size is set by arraysize (256 here).

Python code:

def chunks(cur):
    """Yield batches of rows, each at most cur.arraysize (256 here) long."""
    while True:
        rows = cur.fetchmany()  # fetches up to cur.arraysize rows per round trip
        if not rows:
            break
        yield rows

Then do your processing in a for loop:

for i, chunk in enumerate(chunks(cur)):
    for row in chunk:
        pass  # process your rows here

That is exactly how I do it in my TableHunter for Oracle.
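
For the CSV output the question asks about, here is a minimal end-to-end sketch built on the same fetchmany() pattern. The connect string, query, and output filename are placeholders, not anything from the original post; csv.writer handles the quoting, and the file is opened in binary mode as the Python 2 csv module expects:

import csv
import cx_Oracle

# placeholder credentials and query -- substitute your own
connection = cx_Oracle.connect("user/password@dsn")
sql = "SELECT * FROM some_table"

cursor = connection.cursor()
cursor.arraysize = 256      # rows buffered per round trip to Oracle
cursor.execute(sql)

with open("output.csv", "wb") as f:    # binary mode for csv on Python 2
    writer = csv.writer(f)
    while True:
        rows = cursor.fetchmany()      # fetches cursor.arraysize rows
        if not rows:
            break
        writer.writerows(rows)         # write the whole chunk in one call

cursor.close()
connection.close()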

  • add print statements after each line
  • add a counter to your loop that reports progress after every N rows (see the sketch below)
  • look into a module like 'progressbar' for displaying a progress indicator
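
A minimal sketch of the counter idea, reusing the chunks() generator from the answer above; N and the message format are arbitrary choices, not part of any library API:

import sys

N = 10000       # report progress every N rows; tune to taste
count = 0
for chunk in chunks(cur):
    for row in chunk:
        # ... process the row here ...
        count += 1
        if count % N == 0:
            print "%d rows processed" % count   # Python 2 print statement
            sys.stdout.flush()                  # make progress visible immediately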

I think your code is asking the database for the data one row at a time, which might explain the slowness.

Try:

ctemp = connection.cursor()
ctemp.execute(sql)
results = ctemp.fetchall()  # pulls the entire result set into memory at once
for row in results:
    file.write(row[1])
