
python-mysqldb: How to efficiently get millions/billions of records from a database?

  • I have a table from which I have to fetch around 7 million records, and this will grow to billions of records over time (since data is added every day)
  • I am using mysql-python (MySQLdb) to connect to a remote MySQL database

  • I query it like the following:

cursor = conn.cursor()
cursor.execute(query)
return cursor

and try to print the rows like this:

sql = 'select * from reading;'  # the reading table has 7 million records
cursor.execute(sql)
for row in cursor:
    print row
  • It is taking forever to print the rows

On the server, I can see the mysqld process running:

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3769 mysql    20   0 1120m 276m 5856 S  125  1.7  2218:09 mysqld

Question: What is an efficient way to query a table with millions/billions of records using Python?

Thank you

I would suggest two options:

  1. Dump the required data into a file with SELECT ... INTO OUTFILE, or even from the mysql console client, and work with the file (see the first sketch after this list).

  2. You should understand that, by default, MySQL sends the whole result set to the client, and the client then mimics reading the data row by row (even though the whole result is already in client memory, or the query fails if there is not enough memory). Alternatively, the result set can be kept on the server side. For that, pass the cursorclass=MySQLdb.cursors.SSCursor parameter to MySQLdb.connect (see http://mysql-python.sourceforge.net/MySQLdb.html for details, and the second sketch below).
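
For option 1, a minimal sketch in Python, assuming a table named reading and hypothetical connection parameters (remote-host, user, password, mydb); note that INTO OUTFILE writes the file on the database server's filesystem and requires the FILE privilege:

import MySQLdb

conn = MySQLdb.connect(host='remote-host', user='user',
                       passwd='password', db='mydb')
cursor = conn.cursor()
# INTO OUTFILE writes on the *server's* filesystem; the MySQL user needs
# the FILE privilege, and the target file must not already exist.
cursor.execute("SELECT * FROM reading "
               "INTO OUTFILE '/tmp/reading.csv' "
               "FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\n'")
cursor.close()
conn.close()

If you cannot reach the server's filesystem, the mysql console client achieves the same effect on the client machine by running the SELECT and redirecting its output to a local file.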

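For option 2, a minimal sketch of the server-side (unbuffered) cursor, again with hypothetical connection parameters; note that with SSCursor you must consume (or close) the whole result set before issuing another query on the same connection:

import MySQLdb
import MySQLdb.cursors

conn = MySQLdb.connect(host='remote-host', user='user',
                       passwd='password', db='mydb',
                       cursorclass=MySQLdb.cursors.SSCursor)
cursor = conn.cursor()
cursor.execute('select * from reading')

# Rows stream from the server as you iterate, so the client never holds
# the whole multi-million-row result set in memory at once.
for row in cursor:
    print row  # Python 2 syntax, as in the question

cursor.close()
conn.close()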