简体   繁体   中英

Batch INSERT into 2 related tables while avoid SQL Injection

I'm using Java 8, JDBC and MySql. I want to insert a large amount of data (2,000 rows) into 2 tables. The tables have a 1 to 1 relationship. First table is order_items :

| id      | amount          |
|:--------|----------------:|
| 1       | 20              |
| 2       | 25              |
| 3       | 30              |

Second table is delivery_details :

| orderItemId     | message    |
|----------------:|:-----------|
| 1               | hello.     |
| 2               | salut.     |
| 3               | ciao.      |

orderItemId is a foreign key to order_items .

The data is represented in this class:

public class OrderItemDelivery {

    @SerializedName("amount")
    private BigDecimal amount = null;

    @SerializedName("message")
    private String message = null;


    // getters and setters below
    ...
    ...

}

I need to execute the inserts as a batch to cut execution time. List<OrderItemDelivery> orderItemDeliveries contains 2,000 items. My current code is:

Connection connection = this.hikariDataSource.getConnection();
connection.setAutoCommit(false);
Statement statement = connection.createStatement();


for (int x = 0; x < orderItemDeliveries.size(); x++) {

    sql = String.format("INSERT INTO order_items (amount) VALUES ('%s')", orderItemDelivery.getAmount());
    statement.addBatch(sql);

    sql = String.format("INSERT INTO `delivery_details` (`orderItemId`, `message`) VALUES (LAST_INSERT_ID(), '%s')", orderItemDelivery.getMessage());        
    statement.addBatch(sql);

}

statement.executeBatch();
statement.close();
connection.setAutoCommit(true);
connection.close();

This is really efficient, but the limitation here is it's open to SQL Injection. If I was to use PreparedStatement , I would need one for the order_items batch and one for the delivery_details batch. And then LAST_INSERT_ID() would not work.

Is there any way around this? From what I've seen, there isn't. And I need to prevent SQL Injection by sanitising the message and amount with Java, which appears to have limitations. For example message can contain apostrophies and emojis. Can anyone think of another solution?

EDIT

Here's a really efficient solution I've come up with:

String orderItemSql = "INSERT INTO order_items (amount) VALUES (?) ";

for (int x = 1; x < orderItemDeliveries.size(); x++) {
    orderItemSql += ", (?)";
}

PreparedStatement preparedStatement = connection.prepareStatement(orderItemSql, Statement.RETURN_GENERATED_KEYS);

int i = 1;
for (int x = 0; x < orderItemDeliveries.size(); x++) {

    preparedStatement.setDouble(i++, orderItemDelivery.getAmount().doubleValue());

}

preparedStatement.executeUpdate();
Long ids[] = new Long[orderItemDeliveries.size()];

ResultSet rs = preparedStatement.getGeneratedKeys();
int x = 0;
while (rs.next()) {
    ids[x] = rs.getLong(1);
    x++;
}


String deliveryDetails = "INSERT INTO `delivery_details` (`orderItemId`, `message`) VALUES (?, ?)";
for (x = 1; x < orderItemDeliveries.size(); x++) {
    deliveryDetails += ", (?)";
}

preparedStatement = connection.prepareStatement(deliveryDetails);

i = 1;
for (x = 0; x < orderItemDeliveries.size(); x++) {
    orderItemDelivery = orderItemDeliveries.get(x);

    preparedStatement.setLong(i++, ids[x]);
    preparedStatement.setString(i++, orderItemDelivery.getMessage());
}

preparedStatement.executeUpdate();

So for this to work, the order of the ids must be sequential, and the order of orderItemDeliveries must not change between the first loop through of the list and the second.

This feels a bit hacky, but it works. Am I missing something?

Here's what I ended up doing, using getGeneratedKeys() :

String orderItemSql = "INSERT INTO order_items (amount) VALUES (?) ";

for (int x = 1; x < orderItemDeliveries.size(); x++) {
    orderItemSql += ", (?)";
}

PreparedStatement preparedStatement = connection.prepareStatement(orderItemSql, Statement.RETURN_GENERATED_KEYS);

int i = 1;
for (int x = 0; x < orderItemDeliveries.size(); x++) {

    preparedStatement.setDouble(i++, orderItemDelivery.getAmount().doubleValue());

}

preparedStatement.executeUpdate();
Long ids[] = new Long[orderItemDeliveries.size()];

ResultSet rs = preparedStatement.getGeneratedKeys();
int x = 0;
while (rs.next()) {
    ids[x] = rs.getLong(1);
    x++;
}


String deliveryDetails = "INSERT INTO `delivery_details` (`orderItemId`, `message`) VALUES (?, ?)";
for (x = 1; x < orderItemDeliveries.size(); x++) {
    deliveryDetails += ", (?)";
}

preparedStatement = connection.prepareStatement(deliveryDetails);

i = 1;
for (x = 0; x < orderItemDeliveries.size(); x++) {
    orderItemDelivery = orderItemDeliveries.get(x);

    preparedStatement.setLong(i++, ids[x]);
    preparedStatement.setString(i++, orderItemDelivery.getMessage());
}

preparedStatement.executeUpdate();

So for this to work, the order of the ids must be sequential, and the order of orderItemDeliveries must not change between the first loop through of the list and the second.

This feels a bit hacky, but it works.

Is it even possible with PreparedStatement?

Good point, but since it is a 1:1 relationship you could use a separate sequence or AUTO_INCREMENT keys for each table, not last_insert_id() , given that they generate the same values for correlated records. In an oltp setting with concurrent transactions I wouldn't do that, but since you're batching anyway that may be reasonable. You could force exclusive access by locking both tables exclusively in advance, if that's an option.

Letting the application track the key values is also an option instead of using one autoinc field. Unfortunately mysql doesn't allow to select directly the next value from a sequence, as opposed to Oracle. Eg this way: use a MAXKEY table with field MAX. Say you want to insert 10 rows, MAX is at 200. lock MAXKEY exclusively, select MAX (now you know, your keys can start with 200 + 1), update MAXKEY to 200 + 10, commit (releasing the lock). use 201...210 for 2 sets of batched inserts with prepared queries.

You could use a stored procedure to accept the values for both tables and insert seperately in bot of them (see this ), again using last_insert_id() , and call the procedure in batched fashion (see this ).

Eventually there are sql sanitizers, perhaps something on the line of org.apache.commons.lang.StringEscapeUtils.escapeSlq() may do.

But prepared statements also add other optimizations. The sql gets only sent once to the server, together with a 2-dimensional array of values. The parsed query can be cached and reused for subsequent calls. You should be able to see some more performance improvement just from that.

The string concatenation version sends the whole query for each row, all of them are different, need to be parsed and can't be found in the cache.

I suggest you to try this. Even if it is not a batch approach, it is based upon PreparedStatement , which will always get a better performance over inlined SQL:

private void insertItems(Connection connection, Collection<OrderItemDelivery> orderItemDeliveries)
    throws SQLException
{
    try (PreparedStatement pst1=connection.prepareStatement("INSERT INTO order_items (amount) VALUES (?)", new String[] { "id"});
        PreparedStatement pst2=connection.prepareStatement("INSERT INTO delivery_details(orderItemId, message) VALUES (?, ?)"))
    {
        for (OrderItemDelivery orderItemDelivery : orderItemDeliveries)
        {
            pst1.setString(1, orderItemDelivery.getAmount());
            int x=pst1.executeUpdate();
            if (x != 1)
            {
                throw new SQLException("Row was not inserted");
            }
            try (ResultSet rs=pst1.getGeneratedKeys())
            {
                if (rs.next())
                {
                    long id=rs.getLong(1);
                    // TODO Fill the values in 2nd prepared statement and call executeUpdate().
                }
                else
                {
                    throw new SQLException("Id was not generated");
                }
            }
        }
    }
}

Note: You must try it first; not all db vendors implement the getGeneratedKeys method. In case yours does not, just replace the generated keys piece by a call to LAST_INSERT_ID : It should work the same.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM