
Embedded H2 database: Multithreaded access

I have a program that parses some data out of HTML files and then performs some computations on those files depending on the current date.

Since parsing the files takes a lot of time and they do not change, I now store the data in an embedded H2 database and only parse a file if its data is not already in the database. Otherwise I just query it from the database. Still, the initial parsing, and the parsing needed whenever there are new files, takes a lot of time.

So I am now trying to multithread the parsing. I want to do this using the ExecutorService, splitting the files into multiple tasks (one task per file is probably not a good idea; it will probably be 10-20 tasks to use all cores). What I don't know is how I should implement the database access. Should I have only one connection and create the needed PreparedStatements per task, or should I use a connection pool (and also create the PreparedStatements per task), or something else?

In JDBC, a Connection object may have multiple Statement and PreparedStatement instances, but they can't do their work concurrently. If you issue commands concurrently on the same connection, they will be executed one by one. This is safe if the commands don't depend on each other and you don't modify settings of the connection in them, but your threads will wait for commands from other threads to complete. If your commands aren't fast enough, these additional delays can be noticeable. In that case you can create a separate Connection per thread and create the Statement or PreparedStatement instances for that thread from its own connection. Different connections can execute commands concurrently.
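The connection-per-task approach could look roughly like the sketch below. It is only an illustration under assumptions: the table `parsed_data`, its columns, the `parse` placeholder and the class name `ParallelImport` are hypothetical and not from the question, the table is assumed to exist already, and the H2 driver is assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelImport {

    /** Splits items into at most `parts` contiguous batches of near-equal size. */
    static <T> List<List<T>> partition(List<T> items, int parts) {
        List<List<T>> batches = new ArrayList<>();
        if (items.isEmpty() || parts <= 0) {
            return batches;
        }
        int size = (items.size() + parts - 1) / parts; // ceiling division
        for (int i = 0; i < items.size(); i += size) {
            int end = Math.min(i + size, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    /** Placeholder for the expensive HTML parsing step (hypothetical). */
    static String parse(String file) {
        return "parsed:" + file;
    }

    /** Processes one batch on its own connection; returns the rows inserted. */
    static int importBatch(String jdbcUrl, List<String> files) throws SQLException {
        // Hypothetical table and columns, assumed to exist already.
        String sql = "INSERT INTO parsed_data(file_name, content) VALUES (?, ?)";
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement insert = conn.prepareStatement(sql)) {
            int rows = 0;
            for (String file : files) {
                insert.setString(1, file);
                insert.setString(2, parse(file));
                rows += insert.executeUpdate();
            }
            return rows;
        }
    }

    /** Runs the batches in parallel, one connection per task. */
    static int importAll(String jdbcUrl, List<String> files, int tasks) throws Exception {
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (List<String> batch : partition(files, tasks)) {
                Callable<Integer> task = () -> importBatch(jdbcUrl, batch);
                results.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get(); // rethrows a task failure as ExecutionException
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Each task opens and closes its own connection, so no Connection or PreparedStatement is shared between threads. A connection pool would be a drop-in alternative to `DriverManager.getConnection` if opening connections turns out to be a noticeable cost.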

In H2 you also need a recent version of the database, and you need to use the default MVStore engine. The legacy PageStore engine (;MV_STORE=FALSE) runs in single-threaded mode. Older versions of H2 also use single-threaded mode by default with both storage engines, and it's not fully safe to enable multi-threading in them explicitly.
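As a rough sanity check before starting the worker threads, something like the following sketch could be used. The class and method names are hypothetical, the URL test is only a textual heuristic for the `;MV_STORE=FALSE` setting mentioned above, and `databaseVersion` relies on the standard JDBC `DatabaseMetaData.getDatabaseProductVersion()` call with the H2 driver assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Locale;

final class H2EngineCheck {

    /**
     * Returns true if the JDBC URL explicitly selects the legacy PageStore
     * engine. This is a case-insensitive textual check, not a full URL parser.
     */
    static boolean selectsPageStore(String jdbcUrl) {
        String normalized = jdbcUrl.toUpperCase(Locale.ROOT).replace(" ", "");
        return normalized.contains(";MV_STORE=FALSE");
    }

    /** Reports the database version as seen through JDBC metadata. */
    static String databaseVersion(String jdbcUrl) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            return conn.getMetaData().getDatabaseProductVersion();
        }
    }
}
```

For example, `H2EngineCheck.selectsPageStore("jdbc:h2:./data/cache;MV_STORE=FALSE")` returns true (the path `./data/cache` is made up for illustration), which would be a signal that multiple connections are not going to run in parallel.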


 