
Massive MySQL update best approach?

I need to update the stock levels in my MySQL database 3 times a day from a CSV file.

The CSV has over 27,000 products in it to be updated, and as you can imagine it takes a little while.

I currently have a PHP script that runs the following for each product:

$result = $mysqli->query('SELECT * FROM products WHERE product_code = "xxxxxxx"');
if ($result->num_rows > 0) {
    if ($new_stock_level == 0) {
        $mysqli->query('UPDATE products SET `stock` = 0, `price` = 9.99 WHERE product_code = "xxxxxxx"');
    } else {
        $mysqli->query('UPDATE products SET `stock` = 50, `price` = 9.99, `stock_date` = now() WHERE product_code = "xxxxxxx"');
    }
}

This is all well and good if you are updating < 50 items but not 27,000!

What would be the best way to do an update of this scale?

I have been doing some research, and from what I can see, mysqli prepared statements seem to be where I should be heading.

After trying some of the bits mentioned below and what I have read online, I have had the following results with a batch of 250 updates.

Changing from InnoDB to MyISAM on average increased the number of updates per second from 7 to 27, which is a massive increase to start with.
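
For reference, the engine switch is a single statement (note that MyISAM gives up transactions and row-level locking, so the transaction suggestion further down no longer applies to it):

ALTER TABLE products ENGINE=MyISAM;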

Preparing the statement with CASE: 9-10 sec

## Prepare the statement.
$stmt = $mysqli->prepare("UPDATE products SET stock = case ? when 0 then 0 else ? end, price = ?, stock_date = case ? when 0 then stock_date else now() end WHERE product_code = ?");
$stmt->bind_param('dddds', $stock, $stock, $price, $stock, $prod);
$stmt->execute();

Non-prepared statement: 9-10 sec

$sql = "UPDATE products SET stock = case " . $stock . " when 0 then 0 else " . $stock . " end, price = " . $price . ", stock_date = case " . $stock . " when 0 then stock_date else now() end WHERE product_code = \"" . $prod . "\";\n";
$mysqli->query($sql);

Grouping statements in batches of 50 and executing with multi_query: 9-10 sec

$mysqli->multi_query($sql);
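
For completeness, this is roughly how the batches were built — a sketch, assuming the parsed CSV rows sit in an array called $rows of [product_code, stock, price] (that variable name and shape are not from the original script):

$sql = '';
$count = 0;
foreach ($rows as $row) {
    list($prod, $stock, $price) = $row;
    $prod = $mysqli->real_escape_string($prod);
    $sql .= 'UPDATE products SET stock = ' . (int)$stock
          . ', price = ' . (float)$price
          . ($stock > 0 ? ', stock_date = now()' : '')
          . ' WHERE product_code = "' . $prod . '";';
    if (++$count % 50 == 0) {
        $mysqli->multi_query($sql);
        // multi_query returns after the first statement; drain the rest
        while ($mysqli->more_results() && $mysqli->next_result()) { }
        $sql = '';
    }
}
if ($sql != '') {
    $mysqli->multi_query($sql);
    while ($mysqli->more_results() && $mysqli->next_result()) { }
}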

Non-prepared with 2 separate queries, depending on whether I'm updating the stock date or not: 8-9 sec

if($stock > 0)
{
    $sql = "UPDATE products SET stock = " . $stock . ", price = " . $price . ", stock_date = now() WHERE product_code = \"" . $prod . "\";\n";
}
else
{   
    $sql = "UPDATE products SET stock = " . $stock . ", price = " . $price . " WHERE product_code = \"" . $prod . "\";\n";
}
$mysqli->query($sql);

Prepared version of the same: 8-9 sec

## Prepare the statements
$stmt1 = $mysqli->prepare("UPDATE products SET stock = ?, price = ?, stock_date = now() WHERE product_code = ?;");
$stmt1->bind_param('dds',$stock, $price, $prod);
$stmt2 = $mysqli->prepare("UPDATE products SET stock = ?, price = ? WHERE product_code = ?;");
$stmt2->bind_param('dds', $stock, $price, $prod);

if($stock > 0)
{
    $stmt1->execute();
}
else
{   
    $stmt2->execute();
}

I also tried adding an additional processor to the VPS, and it made it about 4 queries a second faster.

You can use MySQL's CSV storage engine to make a table that accesses your CSV file directly. No need to import it.

Then you can use multi-table UPDATE syntax to join the CSV table directly to your products table using the product_code column. Then you can update columns of products based on the columns read from the CSV table.
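
A sketch of what that could look like — the feed table's column names and types are guesses at your CSV layout, and the CSV engine requires NOT NULL columns and no indexes:

CREATE TABLE stock_feed (
    product_code VARCHAR(32) NOT NULL,
    stock        INT NOT NULL,
    price        DECIMAL(10,2) NOT NULL
) ENGINE=CSV;

-- Point the table at the feed by replacing its .CSV data file in the MySQL
-- data directory with the uploaded file, then:

UPDATE products p
JOIN stock_feed f ON f.product_code = p.product_code
SET p.stock      = f.stock,
    p.price      = f.price,
    p.stock_date = IF(f.stock = 0, p.stock_date, NOW());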

Personally, I would upload the updates into a temporary table, create a unique key on the product_code field, then run an update like this...

UPDATE tmptable p, products pp 
SET pp.stock = p.stock,
    pp.price = p.price,
    pp.stock_date = if(p.stock = 0, pp.stock_date, now())
WHERE pp.product_code = p.product_code
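
The upload step could be done with LOAD DATA — a sketch, assuming the CSV columns are product_code, stock, price with a header row; the path and column names are placeholders:

CREATE TEMPORARY TABLE tmptable (
    product_code VARCHAR(32) NOT NULL,
    stock        INT NOT NULL,
    price        DECIMAL(10,2) NOT NULL,
    UNIQUE KEY (product_code)
);

LOAD DATA LOCAL INFILE '/path/to/stock.csv'
INTO TABLE tmptable
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(product_code, stock, price);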

A couple of things about this...

1. You can do this with one SQL statement:
UPDATE products 
SET stock = case new_stock_level when 0 then 0 else new_stock_level end, 
    price = 9.99,
    stock_date = case new_stock_level when 0 then stock_date else now() end
WHERE product_code = "xxxxxxx";

2. You might want to try wrapping the statements inside a transaction (see the PHP sketch below), e.g.:

START TRANSACTION;
UPDATE products ...;
UPDATE products ...;
... ;
COMMIT;

These two things should speed it up.
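
A sketch of point 2 applied to the prepared-statement version — note that transactions only batch up work on InnoDB; on MyISAM (which the timings above switched to) the wrapper is a no-op. $rows is again an assumed variable holding the parsed CSV:

$mysqli->begin_transaction();   // or $mysqli->query('START TRANSACTION');

$stmt = $mysqli->prepare(
    'UPDATE products
        SET stock = ?, price = ?,
            stock_date = IF(? = 0, stock_date, NOW())
      WHERE product_code = ?'
);
$stmt->bind_param('ddds', $stock, $price, $stock, $prod);

foreach ($rows as $row) {
    list($prod, $stock, $price) = $row;   // variables stay bound by reference
    $stmt->execute();
}

$mysqli->commit();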

Ok, I know this is not directly an answer to your question, but I'd like to suggest a different approach.

Instead of trying to update the entire stock list, why not update only the items that have changed since the last update? You could use some sort of change timestamp to track it. This would heavily depend on your environment, but selecting the current stock levels and comparing them against the CSV file (or the other way around) could actually be faster than updating every single record. Of course, this could be a complete waste of time, but there is only one way to find out...
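
If you want to try it, a minimal sketch of the comparison (the column names and the $csvRows variable are assumptions, not your actual code):

// Pull the current levels once, keyed by product_code.
$current = array();
$res = $mysqli->query('SELECT product_code, stock, price FROM products');
while ($row = $res->fetch_assoc()) {
    $current[$row['product_code']] = $row;
}

$stmt = $mysqli->prepare(
    'UPDATE products
        SET stock = ?, price = ?,
            stock_date = IF(? = 0, stock_date, NOW())
      WHERE product_code = ?'
);
$stmt->bind_param('ddds', $stock, $price, $stock, $prod);

foreach ($csvRows as $row) {
    list($prod, $stock, $price) = $row;
    if (!isset($current[$prod])) {
        continue;                          // unknown product code, skip
    }
    $db = $current[$prod];
    if ((float)$db['stock'] != (float)$stock || (float)$db['price'] != (float)$price) {
        $stmt->execute();                  // only rows that actually changed
    }
}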
