Issue while loading data into Cassandra using DSBulk

I'm having trouble loading data from a .csv file into a table using dsbulk. I get the following in the error log:

Caused by: com.datastax.driver.core.exceptions.OperationTimedOutException: [/10.0.126.13:9042] Timed out waiting for server response

This is our POC environment: 3 nodes, each with 8 CPUs and 64 GB of memory. From what I can see, when I run the dsbulk command it uses all the CPUs on the server, and memory consumption climbs as well.

Can you give me pointers on tuning dsbulk so that CPU usage and memory consumption are reduced? I'm fine with the load running slower as long as performance stays manageable.

You can specify the --executor.maxPerSecond option to limit the number of operations per second. See the DSBulk documentation.

You can also try tuning the batching options, such as --batch.maxBatchStatements.

It's also recommended to run DSBulk from a separate machine so that it doesn't affect DSE's performance (that's common advice for any load testing).
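As a rough sketch of how these two options might be combined, the Python helper below only assembles a `dsbulk load` argument list and prints it; it does not run anything. The file path, keyspace, table and numeric values are placeholders chosen for illustration, not recommendations, and option names can differ between DSBulk versions, so check them against the documentation for the version you have.

```python
import shlex


def build_dsbulk_load_args(csv_path, keyspace, table, host,
                           max_per_second, max_batch_statements):
    """Assemble (but do not run) a throttled `dsbulk load` command line.

    All argument values are supplied by the caller; nothing here is a
    tuned or recommended setting.
    """
    return [
        "dsbulk", "load",
        "-url", csv_path,   # CSV file to load
        "-k", keyspace,     # target keyspace
        "-t", table,        # target table
        "-h", host,         # contact point
        # Cap the number of operations per second to ease CPU pressure.
        "--executor.maxPerSecond", str(max_per_second),
        # Limit how many statements go into each batch.
        "--batch.maxBatchStatements", str(max_batch_statements),
    ]


# Placeholder values; the host is the one from the error log above.
args = build_dsbulk_load_args(
    "/path/to/data.csv", "my_keyspace", "my_table", "10.0.126.13",
    max_per_second=5000, max_batch_statements=32,
)
print(shlex.join(args))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(args, check=True)` on the machine where dsbulk is installed. Starting with a low rate and raising it while watching CPU on the nodes is one way to find a sustainable setting.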

Thanks for the help, everyone. I was able to resolve this by downloading the latest version of DSBulk and setting the batch size to 5000.
