Say I'm printing numbers from two arrays into a file:
from numpy import random
number_of_points = 10000
a = random.rand(number_of_points)
b = random.rand(number_of_points)
fh = open('file.txt', 'w')
for i in range(number_of_points):
    for j in range(number_of_points):
        print('%f %f' % (a[i], b[j]), file=fh)
I feel this is making lots of calls to the system to print, whereas sending one call containing this information would be faster. Is this correct? If so, how could I do this? Are there faster ways to implement this?
print has a lot of bells and whistles you're not using, and you're using C-style looping with indexing instead of direct iteration; both add needless overhead. You might be able to speed it up a bit by limiting the Python-level work and pushing it to the C layer.
For example, in this case, you could replace the whole doubly-nested loop structure with:
import itertools
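# You could use map with '%f %f\n'.__mod__ instead of starmap if you like
# (__mod__ takes each pair as a single tuple); I just find the modern format
# strings a little nicer. '{:f}' keeps the six-decimal output of '%f', whereas
# a bare '{}' would write each value's full str() form.
fh.writelines(itertools.starmap('{:f} {:f}\n'.format, itertools.product(a, b)))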
which uses product to produce the results of your nested loops and indexing directly, starmap + str.format to create the lines, and fh.writelines to exhaust the iterator created by starmap, writing all of its outputs directly to the file with a single function call instead of 100,000,000 calls to print.
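To make the roles of product and starmap concrete, here is a tiny sketch. The two short lists stand in for the arrays a and b (the values are made up for illustration), and '{:f}' is used so the lines mirror the '%f' format from the question:

```python
import itertools

# Tiny stand-ins for the arrays a and b (illustrative values only)
a = [0.1, 0.2]
b = [0.3, 0.4]

# product(a, b) yields the same pairs as the nested loops, in the same order:
# the first input varies slowest, the second fastest
pairs = list(itertools.product(a, b))

# starmap unpacks each pair into the positional arguments of str.format
lines = list(itertools.starmap('{:f} {:f}\n'.format, pairs))
```

In the real call nothing is materialized with list(); fh.writelines pulls one formatted line at a time from the starmap iterator.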
Aside from the fixed setup cost (unrelated to the number of items iterated) of creating the iterators and passing the final one to fh.writelines, the actual iteration, formatting and I/O work will take place entirely at the C layer on the CPython reference interpreter, so it should run quite fast.
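If you want to convince yourself that the two approaches write the same bytes before switching, a minimal check might look like the following. It is only a sketch under some assumptions: it uses plain Python lists from the standard random module instead of numpy arrays, a deliberately small hypothetical size n, and in-memory io.StringIO buffers instead of a real file; the helper names write_with_print and write_with_writelines are invented for this example.

```python
import io
import itertools
import random


def write_with_print(a, b, fh):
    # Direct iteration over the values, but still one print call per line
    for x in a:
        for y in b:
            print('%f %f' % (x, y), file=fh)


def write_with_writelines(a, b, fh):
    # One writelines call that consumes a lazily produced stream of lines
    fh.writelines(
        itertools.starmap('{:f} {:f}\n'.format, itertools.product(a, b))
    )


n = 50  # hypothetical small size so the comparison finishes instantly
rng = random.Random(0)  # fixed seed so the run is repeatable
a = [rng.random() for _ in range(n)]
b = [rng.random() for _ in range(n)]

buf_print = io.StringIO()
buf_lines = io.StringIO()
write_with_print(a, b, buf_print)
write_with_writelines(a, b, buf_lines)

# True when both approaches produced identical text
same_output = buf_print.getvalue() == buf_lines.getvalue()
```

To compare speed on your own machine, you could wrap each helper in timeit.timeit with a larger n; the actual gain depends on your Python version and on how the output file is buffered.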