
Does php bindParam() speed up bulk insert?

I've got a .txt file containing 4,000 lines, and I'm trying to insert them into MySQL. Here are two approaches that do the same thing. The first approach is coded simply, like this:

$start = microtime(true);
foreach($b as $k=>$v){//$b is an array of 4,000 elements
    $db->exec("INSERT INTO siji (en,cn) VALUES ('$v[0]','$v[1]')");
}
echo microtime(true)-$start;//116 sec.

It takes 116 seconds. The second way uses PDO::bindParam(). I know that for repeated SQL queries it is good practice to use bindParam(), because the only difference between the queries is their values, so I coded it like this:

$start = microtime(true);
$stmt = $db->prepare('INSERT INTO siji (en,cn) VALUES (:en,:cn)');
$stmt->bindParam(':en',$en);
$stmt->bindParam(':cn',$cn);
foreach($b as $k=>$v){//$b is an array of 4,000 elements
    $en = $v[0];
    $cn = $v[1];
    $stmt->execute();
}
echo microtime(true)-$start;//127 sec.

The second approach is supposed to be faster than the first one, but the result is not what I expected. Could anyone tell me whether bindParam() really speeds up bulk insertion? Or what could be wrong with how I'm using bindParam()?

You haven't specified what database server you're using, so I'll assume MySQL, as it's the most common.

To directly answer your question: Yes, PDO's prepare function is supposed to use the database's prepared-statement functionality, which should run much faster when executing a batch of similar queries like this.

However, the MySQL PDO driver specifically defaults to emulating prepared statements rather than actually using them.

This means that by default, inside of the PDO object, it's basically doing exactly the same thing as your first code example, building up the SQL string manually.

I don't know why this is the default behaviour (perhaps for compatibility with older MySQL versions?), but to force PDO to use prepared statements properly, you need to disable this option.

You can do this as follows:

$dbh->setAttribute(PDO::ATTR_EMULATE_PREPARES,false);

Try that, and see what happens.
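As a sketch, here is the question's second loop with emulation disabled, assuming the same `$db` PDO handle, `siji` table, and `$b` array from the question (this needs a live MySQL connection to actually run):

```php
<?php
// Force real server-side prepared statements instead of emulated ones.
$db->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);

$start = microtime(true);
// The statement is parsed and planned once by the server...
$stmt = $db->prepare('INSERT INTO siji (en, cn) VALUES (:en, :cn)');
foreach ($b as $v) {
    // ...and only the values are sent on each execute().
    $stmt->execute([':en' => $v[0], ':cn' => $v[1]]);
}
echo microtime(true) - $start;
```

Passing the values as an array to execute() is equivalent to calling bindValue() per parameter; either way the SQL text itself is sent to the server only once.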

By the way, if your .txt file with 4000 lines happens to be a CSV or other regularly formatted file, you could use MySQL's built-in LOAD DATA INFILE function, which can load an entire file into the DB via a single query. This is always much faster than anything you could achieve by looping the same query 4000 times in PHP. (Other DBs have similar functionality).
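For illustration, a minimal LOAD DATA sketch might look like this. The file path, column separator, and connection details are assumptions here, not taken from the question, and the server must permit LOCAL INFILE:

```php
<?php
// Assumption: the file is tab-separated with one "en\tcn" pair per line.
// PDO::MYSQL_ATTR_LOCAL_INFILE must be enabled at connection time.
$db = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass', [
    PDO::MYSQL_ATTR_LOCAL_INFILE => true,
]);
$db->exec("
    LOAD DATA LOCAL INFILE '/path/to/words.txt'
    INTO TABLE siji
    FIELDS TERMINATED BY '\t'
    LINES TERMINATED BY '\n'
    (en, cn)
");
```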

I've got a .txt file containing 4,000 lines, and I'm trying to insert them into MySQL

Use LOAD DATA INFILE then, if you are concerned about speed.

Also, 100 seconds for 4,000 inserts is way too much. You should either wrap your inserts in a transaction or consider configuring InnoDB into a less paranoid mode.
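A sketch of the transaction approach, reusing the question's `$db` handle and `$b` array (assumed here) so InnoDB flushes once per batch instead of once per row:

```php
<?php
$db->beginTransaction();
try {
    $stmt = $db->prepare('INSERT INTO siji (en, cn) VALUES (?, ?)');
    foreach ($b as $v) {
        $stmt->execute([$v[0], $v[1]]);
    }
    // One commit, so one durable flush for the whole batch.
    $db->commit();
} catch (Exception $e) {
    // Undo the partial batch if any insert fails.
    $db->rollBack();
    throw $e;
}
```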

The second approach is supposed to be faster than the first one, but the result is not what I expected. Could anyone tell me whether bindParam() really speeds up bulk insertion?

It actually is faster. Just not necessarily for trivial queries like the one you posted.

This is a bit like benchmarking MySQL vs PostgreSQL. If you run a test with MyISAM tables that does trivial non-concurrent selects, your benchmark might decide that MySQL outperforms Postgres. If, however, you run hundreds of concurrent queries with a half-dozen joins, your benchmark might tell you a very different story.

In your case, you're preparing a trivial insert. It's trivial to parse the SQL; determining the optimal query plan is equally trivial. The benefit of preparing the statement is very slim. If, on the other hand, you have a couple of non-trivial triggers that fire on each insert, you'll likely get a very different story.

There's also something to be said about true prepares vs emulated prepares. Sometimes, a prepared statement doesn't give you an optimal plan. Consider this query:

select * from foo order by bar limit ?

If you prepare the above, the planner cannot decide whether to use an index on bar: if the limit is small enough, the index makes sense; if it's enormous, you might as well fetch the entire table and top-n sort it. And so the planner will pick the latter plan.

In contrast, if you send the final query directly, the planner will have all the elements it needs to decide whether using that same index makes sense or not for that particular value. In other words, an emulated prepare is occasionally better for queries that are only run a single time, or for trivial queries.
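A small PHP sketch of the difference (the table and column names are placeholders). With emulation on, PDO interpolates the value into the SQL text before sending it, so the server plans with the concrete limit; with emulation off, the server must plan without knowing it:

```php
<?php
// With PDO::ATTR_EMULATE_PREPARES left at its default (true), the value
// is substituted client-side and the server sees "... LIMIT 10".
$stmt = $db->prepare('SELECT * FROM foo ORDER BY bar LIMIT :n');
// PDO::PARAM_INT matters here: an emulated prepare would otherwise quote
// the value as a string, which LIMIT rejects.
$stmt->bindValue(':n', 10, PDO::PARAM_INT);
$stmt->execute();
```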

Oh, and don't forget to wrap the entire thing into a single transaction. That'll speed things up significantly.
