I have a lot of data that I want to insert into the DB in the least time possible, so I ran some tests. I created a table (using the script below, *1) with 21 columns: one int column and twenty varchar columns, with no indexes. My test code generates random values and inserts them into the DB (using the INSERT statement below, *2). Before running the SQL I call conn.setAutoCommit(false), and afterwards conn.commit(). This takes about 6-7 seconds. The official documentation (*3) says to use the COPY command for bulk inserts, so I created a similar ASCII file and re-ran the test; it finished in about 5 seconds. Using the same test code on the same machine to insert the same data into MySQL takes less than 1 second. That really surprised me: compared to 6-7 seconds it is a huge performance improvement. Does this difference really exist, or am I overlooking something?
Thanks for any help.
My test configuration is Solaris 10 with PostgreSQL 9.0.2 and MySQL 5.0.85.
(*1)
CREATE TABLE tablo
(
id integer,
column1 character varying(50),
column2 character varying(50),
column3 character varying(50),
....
column20 character varying(50)
)
WITH (
OIDS=FALSE
);
ALTER TABLE tablo OWNER TO pgadmin;
(*2)
INSERT INTO tablo values (1,'column67062724628797','column26007603757271','column73982294239806','column43213154421324','column97722282440805','column79000889379973','column10680880337755','column14322827996050','column80720842739399','column22777514445036','column77771307997926','column92799724462613','column89992937353110','column61693061355353','column43804223262229','column62209656630047','column52150955786400','column85726157993572','column33358888005133','column77743799989746'),(2,'column77383691774831','column67841193885377','column36149612452454','column51161680852595','column91649734476301','column57283307765550','column14997046117948','column29457857794726','column91157683305554','column44413196495111','column40702778794938','column24744999726868','column38356057278249','column16808618337554','column64362413535503','column19577167594144','column72943639162993','column46830376244427','column01942608599939','column66479131355003'),
....
10K lines
(*3) The official PostgreSQL documentation: http://www.postgresql.org/docs/8.3/interactive/populate.html
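For reference, the kind of JDBC loop described above can be sketched roughly like this. This is a minimal sketch, not the asker's actual code: the class name, the batch size of 1000, and the placeholder row data are all assumptions; it batches the prepared INSERT client-side and commits once at the end, as the question describes.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class BatchInsertSketch {
    // Builds "INSERT INTO tablo VALUES (?,?,...)" with the given number of placeholders.
    static String insertSql(int columns) {
        StringBuilder sb = new StringBuilder("INSERT INTO tablo VALUES (");
        for (int i = 0; i < columns; i++) {
            sb.append(i == 0 ? "?" : ",?");
        }
        return sb.append(')').toString();
    }

    // Loads `rows` rows into the 21-column table in one transaction,
    // batching statements to cut down on round trips.
    static void load(Connection conn, int rows) throws SQLException {
        conn.setAutoCommit(false); // one transaction for the whole load
        try (PreparedStatement ps = conn.prepareStatement(insertSql(21))) {
            for (int row = 1; row <= rows; row++) {
                ps.setInt(1, row);
                for (int col = 2; col <= 21; col++) {
                    ps.setString(col, "column" + row + "_" + col); // placeholder data
                }
                ps.addBatch();         // queue the row client-side
                if (row % 1000 == 0) {
                    ps.executeBatch(); // one round trip per 1000 rows (assumed batch size)
                }
            }
            ps.executeBatch();         // flush the remainder
            conn.commit();             // single commit at the end
        }
    }
}
```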
Seems odd that you're not seeing a speedup with things like using COPY. I generated a script to create a similar table and populate it with 10,000 rows, and found that:
Methods 2 and 3 were about 4 times faster than method 1. Method 4 was about 10 times faster than 2 or 3.
If I import the same data into MySQL on my machine, it takes about half the time of methods 2 or 3. Dumping and reloading it: the same. Dumping with -e and reloading it: the same. Using InnoDB brought the time up to the same as methods 2 or 3.
So at least on my hardware/OS combination the speeds of the two are comparable... although of course I take better care of PostgreSQL's settings. For a small table like this, though, I wouldn't expect things like buffer cache size to matter much.
Now, as to how good the JDBC support for doing batch inserts is, I have no idea. I did all these things using just the command-line clients.
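For what it's worth, pgJDBC does expose COPY programmatically through its CopyManager class, so the COPY path can be driven from Java rather than the command-line client. A hedged sketch (the connection URL, credentials, and generated data are assumptions, and the pgjdbc driver jar must be on the classpath):

```java
import java.io.StringReader;
import java.sql.Connection;
import java.sql.DriverManager;

import org.postgresql.PGConnection;       // from the pgjdbc driver jar
import org.postgresql.copy.CopyManager;

public class CopySketch {
    public static void main(String[] args) throws Exception {
        // The URL and user are assumptions; point them at your own server.
        String url = "jdbc:postgresql://localhost:5432/test?user=pgadmin";
        try (Connection conn = DriverManager.getConnection(url)) {
            CopyManager copy = conn.unwrap(PGConnection.class).getCopyAPI();
            // Build rows in COPY's default tab-separated text format.
            StringBuilder data = new StringBuilder();
            for (int row = 1; row <= 10000; row++) {
                data.append(row);
                for (int col = 1; col <= 20; col++) {
                    data.append('\t').append("column").append(col);
                }
                data.append('\n');
            }
            // Stream all rows through a single COPY ... FROM STDIN statement.
            copy.copyIn("COPY tablo FROM STDIN", new StringReader(data.toString()));
        }
    }
}
```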
There are two major considerations here:
So, unless bulk inserts from a single connection are the norm for your application, this test doesn't really show anything useful. It is more likely that you will be using dozens of connections simultaneously to insert, query, and/or update small chunks of data.
That way I have achieved 400,000-500,000 inserts per second, with index creation, on a 10-column table (2 Xeons, 24 cores, 24 GB of memory, SSD).