I have to insert 6 to 10 million fresh rows into a PostgreSQL table. These inserts cannot be avoided.
I can scale by using multiple processes running on multiple machines.
I would like to know: given a Linux machine with an Intel i7 and 16 GB of memory, what is the maximum number of rows that can be inserted in one minute using only one Java process, assuming TCP/IP tuning and so on?
With this answer combined with my experiments, I can determine the number of processes per machine and the number of machines.
I have a hard time hitting more than 2,000 - 2,500 rows per minute.
I am using Hibernate 5.2.5 with Stateless Sessions and just one thread.
When I run 10 such processes, I am able to do 20,000 - 25,000 inserts per minute, which is a linear increase in throughput.
I would like to know whether I am doing anything wrong in my single process, whether this number is too low, and how I can increase the per-process throughput to, say, at least 10,000 rows per minute.
Thanks, Kumar
The answer is almost certainly to run all INSERTs in a single transaction (per thread).
Unless you do that, each INSERT runs in its own transaction, and that forces a flush to disk for the transaction log at the end of each insert, which will bring processing to a crawl.
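As a rough illustration of the single-transaction approach, here is a minimal sketch using plain JDBC rather than Hibernate. The table `items`, its `name` column, the class name and the `batchSize` parameter are all hypothetical and not from the question. The sketch also batches statements with `addBatch`/`executeBatch`, which goes a step beyond what is described above; the essential part is turning auto-commit off and committing once at the end.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class SingleTransactionInsert {

    // Hypothetical table and column, used only for illustration.
    static final String SQL = "INSERT INTO items (name) VALUES (?)";

    // Inserts all names inside one transaction and returns the number of batches sent.
    static int insertAll(Connection conn, List<String> names, int batchSize) throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // otherwise every INSERT is committed on its own
        int batches = 0;
        try (PreparedStatement ps = conn.prepareStatement(SQL)) {
            int pending = 0;
            for (String name : names) {
                ps.setString(1, name);
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch(); // send a full batch to the server
                    batches++;
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch(); // send the remaining rows
                batches++;
            }
            conn.commit(); // one commit for the whole load
        } catch (SQLException e) {
            conn.rollback(); // discard the partial load on failure
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
        return batches;
    }
}
```

It would be called with a connection obtained from `DriverManager.getConnection(...)` using your own URL and credentials. For a load of several million rows, committing every few hundred thousand rows instead of once may be a reasonable compromise; that is a judgment call, not something measured here.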
If this is a one-time procedure, and you can live with the risk of losing your data (e.g., you made a good backup before you started), you can alternatively set fsync=off in PostgreSQL (but don't forget to reset it to its original value when you are done!).
I used a low-end HP laptop with an Intel CPU. With one process and one thread, I was able to get close to 50,000 inserts per minute (60 seconds) with no tuning.
The original numbers were obtained on a different setup. I had my suspicions, which is why I wanted answers from Stack Overflow. I have my answer, so no further answers are needed.
Thank you for the one answer. These rules must also be followed. In addition, when a system is developed, it must be possible to run it as distributed and parallel programs.
Thanks, Kumar