Fetching huge data from Oracle in Python
I need to fetch a huge amount of data from Oracle (using cx_Oracle) in Python 2.6 and produce a CSV file.
The data size is about 400k records x 200 columns x 100 chars each.
What is the best way to do that?
Right now, using the following code...
ctemp = connection.cursor()
ctemp.execute(sql)
ctemp.arraysize = 256
for row in ctemp:
    file.write(row[1])
    ...
... the script stays in the loop for hours and nothing is written to the file... (Is there a way to print a message for every record extracted?)
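One way to get the per-record progress message the question asks for is to count rows as you iterate and print every N rows while writing proper CSV with the csv module. A minimal, database-free sketch: `export_with_progress` is a hypothetical helper name, and the fake row list stands in for the cx_Oracle cursor (any iterable of row tuples works the same way).

```python
import csv

def export_with_progress(cursor, path, every=250):
    """Write rows to CSV, printing a progress message every `every` rows."""
    count = 0
    # On Python 2.6, open the file with mode "wb" instead of newline="".
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for count, row in enumerate(cursor, 1):
            writer.writerow(row)
            if count % every == 0:
                print("%d rows written" % count)
    return count

# Stand-in for a cx_Oracle cursor: any iterable of row tuples.
fake_rows = [("id%d" % i, "value%d" % i) for i in range(1000)]
total = export_with_progress(iter(fake_rows), "out.csv")
```

Writing whole rows through `csv.writer` also avoids the silent no-output problem of `file.write(row[1])`, which only ever emits the second column with no delimiters or newlines.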
Note: I don't have any issue with Oracle itself; running the query in SQL Developer is super fast.
Thank you, gian
You should use cur.fetchmany() instead. It will fetch a chunk of rows defined by arraysize (256).
Python code:
def chunks(cur):  # 256
    global log, d
    while True:
        # log.info('Chunk size %s' % cur.arraysize, extra=d)
        rows = cur.fetchmany()
        if not rows:
            break
        yield rows
Then do your processing in a for loop:
for i, chunk in enumerate(chunks(cur)):
    for row in chunk:
        # Process your rows here
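Putting the generator together with CSV output, here is a self-contained sketch of the chunked export. `FakeCursor` is my stand-in (not part of cx_Oracle) that implements only `fetchmany` and `arraysize`, so the chunking logic can be run and tested without a database:

```python
import csv

class FakeCursor(object):
    """Minimal stand-in for a cx_Oracle cursor (fetchmany + arraysize only)."""
    def __init__(self, rows, arraysize=256):
        self._rows = rows
        self.arraysize = arraysize
        self._pos = 0

    def fetchmany(self, size=None):
        size = size or self.arraysize
        batch = self._rows[self._pos:self._pos + size]
        self._pos += len(batch)
        return batch

def chunks(cur):
    """Yield batches of cur.arraysize rows until the result set is exhausted."""
    while True:
        rows = cur.fetchmany()
        if not rows:
            break
        yield rows

cur = FakeCursor([(i, "name%d" % i) for i in range(600)], arraysize=256)
written = 0
# On Python 2.6, open with mode "wb" instead of newline="".
with open("export.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for chunk in chunks(cur):
        writer.writerows(chunk)
        written += len(chunk)
print(written)  # 600
```

With 600 rows and arraysize=256, the loop runs three times (256 + 256 + 88 rows), so memory holds at most one 256-row batch at a time.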
That is exactly how I do it in my TableHunter for Oracle.
I think your code is asking the database for the data one row at a time, which might explain the slowness.
Try:
ctemp = connection.cursor()
ctemp.execute(sql)
results = ctemp.fetchall()
for row in results:
    file.write(row[1])
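A caveat worth noting (my addition, not from the answer): fetchall() materializes the entire result set in memory at once. A quick back-of-the-envelope calculation with the sizes from the question shows why that is risky here, and why chunked fetching with fetchmany() is usually the safer choice for exports this large:

```python
# Rough size of the result set described in the question:
rows = 400000   # ~400k records
cols = 200      # columns
chars = 100     # characters per column
approx_bytes = rows * cols * chars
print(approx_bytes / 1e9)  # ~8.0 GB of raw character data, before Python object overhead
```

Python string objects add significant per-value overhead on top of the raw characters, so the real footprint of a fetchall() here would be considerably larger still.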