
Concurrent use of same JDBC connection by multiple threads

I'm trying to better understand what happens when multiple threads concurrently execute different SQL queries over the same JDBC connection.

  • Will the outcome be functionally correct?

  • What are the performance implications?

  • Will thread A have to wait for thread B to be completely done with its query?

  • Or will thread A be able to send its query immediately after thread B has sent its query, after which the database will execute both queries in parallel?


I see that Apache DBCP uses synchronization to ensure that a connection obtained from the pool is removed from the pool, and unavailable to other threads, until it is closed. This seems more inconvenient than it needs to be. I'm thinking of building my own "pool" simply by creating a static list of open connections and handing them out in a round-robin manner.

I don't mind the occasional performance degradation, and the convenience of not having to close the connection after every use seems very appealing. Is there any downside to me doing this?

I ran the following set of tests using an AWS RDS Postgres database and Java 11:

  1. Create a table with 11M rows, each row containing a single TEXT column, populated with a random 100-char string

  2. Pick a random 5 character string, and search for partial-matches of this string, in the above table

  3. Time how long the above query takes to return results. In my case, it takes ~23 seconds. Because very few results are returned, we can conclude that most of these 23 seconds are spent waiting for the DB to run the full table scan, not sending the request/response packets

  4. Run multiple queries in parallel (with different keywords), using different connections. In my case, they all complete in ~23 seconds. That is, the queries are being efficiently parallelized

  5. Run multiple queries on parallel threads, using the same connection. I now see that the first result comes back in ~23 seconds, the second in ~46 seconds, the third in ~1 minute, and so on. All the results are functionally correct, in that they match the specific keyword queried by that thread
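The comparison in steps 4 and 5 could be reproduced with a harness along the following lines. This is only a sketch, not the code that produced the numbers above: the table name `big_table`, the column name `payload`, and the class and method names are all hypothetical, and the query returns a count rather than the matching rows to keep the example short.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelQueryTimer {

    // Builds a task that runs one partial-match search on the given connection.
    // "big_table" and "payload" are hypothetical names chosen for this sketch.
    static Callable<Integer> searchTask(Connection conn, String keyword) {
        return () -> {
            String sql = "SELECT count(*) FROM big_table WHERE payload LIKE ?";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, "%" + keyword + "%");
                try (ResultSet rs = ps.executeQuery()) {
                    rs.next();
                    return rs.getInt(1);
                }
            }
        };
    }

    // Submits every task to its own thread and returns, for each task, the
    // elapsed milliseconds between the common start time and its completion.
    // The task list must not be empty.
    static List<Long> completionTimesMillis(List<? extends Callable<?>> tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(tasks.size());
        try {
            long start = System.nanoTime();
            List<Future<Long>> futures = new ArrayList<>();
            for (Callable<?> task : tasks) {
                futures.add(pool.submit(() -> {
                    task.call();
                    return (System.nanoTime() - start) / 1_000_000;
                }));
            }
            List<Long> times = new ArrayList<>();
            for (Future<Long> f : futures) {
                times.add(f.get());
            }
            return times;
        } finally {
            pool.shutdown();
        }
    }
}
```

For step 4, each `searchTask` would be given its own `Connection`; for step 5, every task would be given the same one. If the driver serializes access as described, the completion times in the second case should grow roughly linearly with the position of each query.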

To add to what Joni mentioned earlier: his conclusion matches the behavior I'm seeing on Postgres as well. It appears that correctness is preserved, but all parallelism benefits are lost, when multiple queries are sent on the same connection at the same time.

Since the JDBC spec doesn't give guarantees of concurrent execution, this question can only be answered by testing the drivers you're interested in, or reading their source code.

In the case of MySQL Connector/J, all methods to execute statements lock the connection with a synchronized block. That is, if one thread is running a query, other threads using the connection will be blocked until it finishes.
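The effect of such a per-connection lock can be illustrated without a database. The sketch below is a simulation, not Connector/J code: `FakeConnection` is a hypothetical stand-in whose `execute` method is `synchronized` and simply sleeps for the duration of the "query".

```java
class LockedConnectionDemo {

    // Hypothetical stand-in for a driver connection that locks itself
    // for the full duration of each statement execution.
    static class FakeConnection {
        synchronized void execute(long queryMillis) throws InterruptedException {
            Thread.sleep(queryMillis); // the monitor stays held while "the query runs"
        }
    }

    // Runs one query on connection a and one on connection b from two threads
    // and returns the total wall-clock time in milliseconds.
    static long runTwoQueries(FakeConnection a, FakeConnection b, long queryMillis)
            throws InterruptedException {
        long start = System.nanoTime();
        Thread t1 = new Thread(() -> run(a, queryMillis));
        Thread t2 = new Thread(() -> run(b, queryMillis));
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return (System.nanoTime() - start) / 1_000_000;
    }

    private static void run(FakeConnection conn, long queryMillis) {
        try {
            conn.execute(queryMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Calling `runTwoQueries(c, c, 200)` with one shared connection should take about twice as long as calling it with two separate `FakeConnection` instances, because the second thread cannot enter `execute` until the first one leaves it.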

Doing things the wrong way will have undefined results. If someone runs some tests, maybe they'll answer all your questions exactly, but then a new JVM comes out, or someone tries another JDBC driver, database version, platform, or JVM implementation, or hits a different set of race conditions, and a different undefined result happens.

If two threads modify the same state at the same time, anything could happen depending on the timing. Maybe the second thread overwrites the first's query, and then both run the same query. Maybe the library will detect your error and throw an exception. I don't know and wouldn't bother testing (or maybe someone already knows, or it should be obvious what would happen), so this isn't "the answer", just some advice: use a connection pool, or use a synchronized block to ensure problems don't happen.
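As an illustration of the pool option, here is a minimal sketch of exclusive checkout that also removes the need to remember to close or return anything by hand. It is not how DBCP or any other real pool is implemented; `ExclusivePool`, `Work`, and `withResource` are hypothetical names, and the class is generic so that it could hold `java.sql.Connection` objects or anything else.

```java
import java.util.Collection;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class ExclusivePool<T> {

    // A unit of work that is allowed to throw checked exceptions (e.g. SQLException).
    interface Work<T, R> {
        R apply(T resource) throws Exception;
    }

    private final BlockingQueue<T> idle;

    // The collection must contain at least one resource.
    ExclusivePool(Collection<T> resources) {
        // Fair queue, pre-filled with every resource the pool owns.
        this.idle = new ArrayBlockingQueue<>(resources.size(), true, resources);
    }

    // Borrows a resource, runs the work on it, and always hands it back.
    <R> R withResource(Work<T, R> work) throws Exception {
        T resource = idle.take(); // blocks until some resource is free
        try {
            return work.apply(resource);
        } finally {
            idle.add(resource); // cannot overflow: we just removed one element
        }
    }
}
```

With a pool built from open connections, a caller would write something like `pool.withResource(conn -> runQuery(conn))`, where `runQuery` is whatever JDBC code the caller already has. No two threads ever hold the same resource at once, and the caller never touches `close()`.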

We had to disable the statement cache on WebSphere because it was throwing ArrayIndexOutOfBoundsException at the PreparedStatement level. The cause was that someone thought it was smart to share a connection between multiple threads. He said it was to save connections, but there is no point in multithreading queries on one connection because the DB won't run them in parallel.

There was also an issue with Java Runnables that were blocking each other because they used the same connection.

So that's just something not to do; there is nothing to gain.

There is an option in WebSphere to detect this multithreaded access. I implemented my own, since we use Jetty in development.
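One way such a check could be built is with a dynamic proxy around `java.sql.Connection`. The sketch below is hypothetical: it is neither the WebSphere option nor the implementation mentioned above, and `ThreadConfinedConnection` is an invented name. It binds the connection to the first thread that calls it, rejects calls from any other thread, and releases the binding on `close()`. It only guards calls made on the `Connection` itself, not on the `Statement` objects it returns.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.sql.Connection;

class ThreadConfinedConnection implements InvocationHandler {
    private final Connection target;
    private Thread owner; // guarded by "this"

    private ThreadConfinedConnection(Connection target) {
        this.target = target;
    }

    // Returns a Connection that forwards to target but checks the calling thread first.
    static Connection wrap(Connection target) {
        return (Connection) Proxy.newProxyInstance(
                ThreadConfinedConnection.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                new ThreadConfinedConnection(target));
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        Thread me = Thread.currentThread();
        synchronized (this) {
            if (owner == null) {
                owner = me; // first caller becomes the owner
            } else if (owner != me) {
                throw new IllegalStateException("Connection owned by " + owner.getName()
                        + " was used from " + me.getName());
            }
        }
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow the original exception, e.g. SQLException
        } finally {
            if ("close".equals(method.getName())) {
                synchronized (this) {
                    owner = null; // closing releases the connection for the next thread
                }
            }
        }
    }
}
```

In a development setup, connections handed out by the pool could be passed through `ThreadConfinedConnection.wrap(...)` so that accidental sharing fails fast with an `IllegalStateException` instead of surfacing later as blocked threads or corrupted statement state.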


 