How to use CopyManager with connection pooled DataSource?
I'm trying to use Postgres CopyManager.copyIn() for batch inserts. My datasource is a c3p0 ComboPooledDataSource.
The SQL statements are batched with:
dataSource.getConnection().getCopyAPI().copyIn(sql, items); //pseudocode
Now, to speed up database inserts even more (hundreds of GB to import after pre-processing), I'm trying to send the copyIn commands from asynchronous threads.
But does this make sense if the database is located on a single disk filesystem? Would this gain performance?
And how can I actually verify that copyIn is using the connection pool in parallel?
I tried the VisualVM MBeans screen, where I can see a single PooledDataSource entry. But how can I know that the pool is used and items are sent to the DB in parallel?
But does this make sense if the database is located on a single disk filesystem?
Would this gain performance?
If it's spinning rust (a mechanical hard disk) it might not, and there's certainly little benefit in much concurrency. For SSDs it'll sometimes produce quite a significant improvement. It depends a lot on the drive.
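Since the gain depends on the drive, one way to find out is to measure it: run the same import with a small, fixed number of workers and compare timings. The following is only a minimal sketch of that idea. `ParallelCopySketch` and `ChunkCopier` are hypothetical names, not part of pgjdbc or c3p0. The real copy step appears only as a comment, and it assumes the pooled connection can be unwrapped to `org.postgresql.PGConnection`, which may depend on the c3p0 version in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelCopySketch {

    /** Hypothetical callback: copies one chunk of rows and returns the row count. */
    public interface ChunkCopier {
        long copy(String chunk) throws Exception;
    }

    /**
     * Submits every chunk to a fixed-size thread pool and sums the row counts.
     * With `workers` threads, at most `workers` copies run at the same time,
     * so the connection pool needs at least that many connections.
     */
    public static long copyAll(List<String> chunks, int workers, ChunkCopier copier)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (String chunk : chunks) {
                Callable<Long> task = () -> copier.copy(chunk);
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get(); // a worker failure surfaces as ExecutionException
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in copier that only counts lines. Against a real database the
        // body would, assuming the pool supports unwrap(), look roughly like:
        //   try (Connection c = dataSource.getConnection()) {
        //       return c.unwrap(PGConnection.class).getCopyAPI()
        //               .copyIn(sql, new StringReader(chunk));
        //   }
        ChunkCopier countLines = chunk -> chunk.split("\n").length;

        List<String> chunks = Arrays.asList("1\ta\n2\tb\n", "3\tc\n");
        long start = System.nanoTime();
        long rows = copyAll(chunks, 2, countLines);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(rows + " rows in " + millis + " ms");
    }
}
```

Running the real import with `workers` set to 1, 2 and 4 and comparing the elapsed times would show whether the drive benefits from concurrent COPY at all. Borrowing the connection inside try-with-resources matters here, because a connection that is never closed is never returned to the pool.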
I tried the VisualVM MBeans screen, where I can see a single PooledDataSource entry. But how can I know that the pool is used and items are sent to the DB in parallel?
Look at pg_stat_activity in the database and see if there are multiple concurrent COPY commands from the app.
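The check can also be run from Java while the import is in progress. This is a rough sketch, not the only way to do it: `CopyActivityCheck` and the `monitor` DataSource are hypothetical names, and the column names `pid`, `state` and `query` assume PostgreSQL 9.2 or later. The filter on `'copy %'` assumes the app's statements start with the keyword COPY.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class CopyActivityCheck {

    // Lists backends that are currently executing a COPY statement,
    // excluding the session that runs this query.
    public static final String ACTIVE_COPY_SQL =
        "SELECT pid, state, query FROM pg_stat_activity "
      + "WHERE state = 'active' AND query ILIKE 'copy %' "
      + "AND pid <> pg_backend_pid()";

    /** Prints each active COPY backend and returns how many were found. */
    public static int countActiveCopies(DataSource monitor) throws SQLException {
        try (Connection c = monitor.getConnection();
             PreparedStatement ps = c.prepareStatement(ACTIVE_COPY_SQL);
             ResultSet rs = ps.executeQuery()) {
            int n = 0;
            while (rs.next()) {
                System.out.println(rs.getInt("pid") + "  " + rs.getString("query"));
                n++;
            }
            return n;
        }
    }
}
```

If `countActiveCopies` returns more than one while the import is running, several COPY commands are executing concurrently. Running the same SELECT by hand in psql gives the same information.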