
How do I make my function for generating test data for my database faster?

I want to create test data and have written a function that stores the products generated by my product generators in my database.

The plan is to create about 10,000,000 products or more for testing purposes.

Every time before I insert a product, I want to check whether a product with the same name already exists.

If it does, the product isn't stored in the database. I know the performance problem is the existence check, which takes longer and longer as the number of products in the database grows, but I don't know of any other way to do this. I might be able to use indexes, but I don't know how to apply them in this scenario. If you have other ideas for improving performance, please feel free to share them.

tl;dr: I want to create test data, but it takes too long because every insert first checks whether the product already exists. I want to improve performance.

Here is my code:

public String insertProdukt(String name, Double preis, Integer kat_id) throws SQLException, ClassNotFoundException {
    Connection connection = ConnectionUtils.createNewConnection();
    // does the product exist?
    Statement statement = connection.createStatement();
    ResultSet resultSet = statement.executeQuery("select * from pro_produkte where pro_name=\"" + name + "\" AND pro_preis=\"" + preis + "\" AND pro_kat_id=\"" + kat_id + "\"");

    if (resultSet.next()) {
        //it does exist
        System.out.println("Produkt: " + resultSet.getString("pro_name") + " existiert bereits");
    } else {
        //it doesn't -> insert into database
        String sql = "Insert INTO pro_produkte (pro_name, pro_preis, pro_kat_id)"
                + "VALUES (\"" + name + "\", \"" + preis + "\", \"" + kat_id + "\")";
        statement.executeUpdate(sql);
        System.out.println("Produkt: " + name + " erstellt");

    }
    resultSet.close();
    statement.close();
    connection.close();
    return null;
}

Thanks!

Instead of a plain INSERT ..., use

INSERT IGNORE ...

And have a UNIQUE (or PRIMARY) key that will catch the "duplicate".
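As a rough sketch of how this could look for the table in the question: the code below assumes MySQL and assumes that a "duplicate" means the same combination of name, price and category, since that is what the question's SELECT compares. The key name uq_produkt and the class name are made up for illustration, and a long pro_name column may need a prefix length or a shorter type to fit into the key.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

public class ProduktInsertIgnore {

    // Hypothetical key name; the column list mirrors the question's SELECT.
    static final String ADD_UNIQUE_KEY =
            "ALTER TABLE pro_produkte "
          + "ADD UNIQUE KEY uq_produkt (pro_name, pro_preis, pro_kat_id)";

    // The unique key does the duplicate check, so no SELECT is needed first.
    static final String INSERT_IGNORE =
            "INSERT IGNORE INTO pro_produkte (pro_name, pro_preis, pro_kat_id) "
          + "VALUES (?, ?, ?)";

    // Run once, before loading any data.
    static void addUniqueKey(Connection connection) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.executeUpdate(ADD_UNIQUE_KEY);
        }
    }

    // Returns true if the row was inserted, false if it was skipped as a duplicate.
    static boolean insertProdukt(Connection connection, String name, double preis, int katId)
            throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement(INSERT_IGNORE)) {
            ps.setString(1, name);
            ps.setDouble(2, preis);
            ps.setInt(3, katId);
            return ps.executeUpdate() == 1;
        }
    }
}
```

The connection is passed in by the caller, so the same one can be reused across all inserts instead of being opened and closed per product.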

Inserting one row at a time is about 10 times as slow as inserting 100 rows at a time. So, if you are generating them by code, do

INSERT IGNORE INTO t
    (col1, col2, ...)
    VALUES
    (1,2,...),
    (22,55,...),
    ... ;

Or

LOAD DATA LOCAL INFILE '...' IGNORE ...

if reading from a file.
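From Java, the multi-row form could be built roughly like this. This is only a sketch under assumptions: the table and column names are taken from the question, while the Produkt holder class, the MultiRowInsert class and the method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

public class MultiRowInsert {

    // Hypothetical value holder for one generated product.
    static final class Produkt {
        final String name;
        final double preis;
        final int katId;

        Produkt(String name, double preis, int katId) {
            this.name = name;
            this.preis = preis;
            this.katId = katId;
        }
    }

    // Builds "INSERT IGNORE ... VALUES (?, ?, ?), (?, ?, ?), ..." with one group per row.
    static String buildSql(int rowCount) {
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be >= 1");
        }
        return "INSERT IGNORE INTO pro_produkte (pro_name, pro_preis, pro_kat_id) VALUES "
                + String.join(", ", Collections.nCopies(rowCount, "(?, ?, ?)"));
    }

    // Sends one chunk (for example 100 products) as a single statement.
    // Returns the update count reported by the driver for that statement.
    static int insertChunk(Connection connection, List<Produkt> chunk) throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement(buildSql(chunk.size()))) {
            int index = 1;
            for (Produkt p : chunk) {
                ps.setString(index++, p.name);
                ps.setDouble(index++, p.preis);
                ps.setInt(index++, p.katId);
            }
            return ps.executeUpdate();
        }
    }
}
```

The generator would collect products into a list and call insertChunk once per chunk. Placeholders are used instead of string concatenation so that names containing quotes do not break the statement.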

First thing - do not open a connection for every insert, unless you are using a connection pool.

Second thing - use PreparedStatement. Not only will this save you from SQL injection, it will also make it faster because it will avoid repetitive parsing.

Third thing - use PreparedStatement.addBatch() and commit a batch every 5000 rows (or something like that). This implies you use the same Connection and PreparedStatement for all inserts.
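The first three points could be combined along these lines. Treat it as a minimal sketch rather than a finished implementation: the table and column names come from the question, the Produkt holder class and the insertAll method are hypothetical, and the batch size is whatever the caller passes (5000, as suggested above, is only a starting point).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class ProduktBatchInsert {

    // Hypothetical value holder for one generated product.
    static final class Produkt {
        final String name;
        final double preis;
        final int katId;

        Produkt(String name, double preis, int katId) {
            this.name = name;
            this.preis = preis;
            this.katId = katId;
        }
    }

    static final String INSERT_SQL =
            "INSERT INTO pro_produkte (pro_name, pro_preis, pro_kat_id) VALUES (?, ?, ?)";

    // Reuses one Connection and one PreparedStatement for all rows and
    // commits after every batchSize rows.
    static void insertAll(Connection connection, Iterable<Produkt> produkte, int batchSize)
            throws SQLException {
        boolean previousAutoCommit = connection.getAutoCommit();
        connection.setAutoCommit(false);
        try (PreparedStatement ps = connection.prepareStatement(INSERT_SQL)) {
            int pending = 0;
            for (Produkt p : produkte) {
                ps.setString(1, p.name);
                ps.setDouble(2, p.preis);
                ps.setInt(3, p.katId);
                ps.addBatch();
                pending++;
                if (pending == batchSize) {
                    ps.executeBatch();
                    connection.commit();
                    pending = 0;
                }
            }
            // Flush the last, partially filled batch.
            if (pending > 0) {
                ps.executeBatch();
                connection.commit();
            }
        } finally {
            connection.setAutoCommit(previousAutoCommit);
        }
    }
}
```

How much batching helps depends on the driver. With MySQL Connector/J, for example, the connection property rewriteBatchedStatements=true lets the driver send a batch as multi-row inserts.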

Fourth thing - if you are only filling the database with test data and you know that your test data is unique, create the index AFTER you insert all the records. It will be significantly faster.

Fifth thing - if you are using InnoDB, make sure you have enough buffer space to keep the entire index in memory, and put the database on an SSD (~30x faster than HDD).

If you can do it outside Java, you can use the database's proprietary features for bulk loading or for restoring from backups or snapshots. Check what features your database provides.
