
Read data from database and write into file using multithreading

I want to develop a program that reads data from a database and writes it to a file. For better performance, I want to use multithreading.

The solution I plan to implement is based on these assumptions:

  1. It is not necessary to use multiple threads to read from the database, because the DBMS would have to manage the resulting concurrency (the same applies to writing to the file), given that each row read from the database is deleted in the same transaction.
  2. Use the producer-consumer model: one thread (the main program) reads the data, and another thread writes the data to the file.
  3. For the implementation I will use the executor framework: a thread pool (size=1) to represent the consumer thread.

Do these assumptions make for a good solution? Does this problem require a solution based on multithreading?

It is not necessary to use multiple threads to read from the database, because the DBMS would have to manage the resulting concurrency.

Ok. So you want one thread that is reading from the database.

Do these assumptions make for a good solution? Does this problem require a solution based on multithreading?

Your solution will work but, as others have mentioned, it is questionable whether it will improve performance at all. Multithreaded programs gain speed by making use of the multiple processors (or cores) on your computer. In your case, if the threads are blocked by the database or by the file system, the performance improvement may be minimal, if there is any. If you were doing a lot of processing on the data, then having multiple threads handle that work would pay off.
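
For reference, here is a minimal sketch of the design described in the question: the calling thread is the producer and a single-thread executor is the consumer that writes to the file. The class name DbToFileExport, the END_OF_DATA sentinel, the queue size of 1024, and the use of an Iterator<String> as a stand-in for the database cursor are all assumptions made for illustration. A real version would iterate over a JDBC ResultSet and handle the delete in its transaction.

```java
import java.io.BufferedWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class DbToFileExport {

    // Hypothetical sentinel telling the writer that no more rows are coming.
    // "new String" gives it a unique identity, so a real row with the same
    // text is never mistaken for it.
    private static final String END_OF_DATA = new String("<end-of-data>");

    /**
     * Writes every row to the target file, one row per line, and returns the
     * number of rows written. "rows" stands in for the database cursor.
     */
    public static long export(Iterator<String> rows, Path target)
            throws InterruptedException, ExecutionException {
        // Bounded queue: the reader blocks instead of filling the heap
        // when the writer is slower than the database.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
        ExecutorService writerPool = Executors.newSingleThreadExecutor();
        try {
            // Consumer: the single pool thread drains the queue into the file.
            Future<Long> written = writerPool.submit(() -> {
                long count = 0;
                try (BufferedWriter out =
                         Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
                    for (String row = queue.take(); row != END_OF_DATA; row = queue.take()) {
                        out.write(row);
                        out.newLine();
                        count++;
                    }
                }
                return count;
            });

            // Producer: the calling thread reads rows and hands them over.
            while (rows.hasNext() && !written.isDone()) {
                handOver(queue, rows.next(), written);
            }
            handOver(queue, END_OF_DATA, written);

            // Rethrows any writer failure wrapped in an ExecutionException.
            return written.get();
        } finally {
            // Interrupts the writer if the producer failed part-way through;
            // harmless when the writer has already finished.
            writerPool.shutdownNow();
        }
    }

    // Offers with a timeout so the producer cannot block forever
    // on a full queue after the writer has died.
    private static void handOver(BlockingQueue<String> queue, String item, Future<?> writer)
            throws InterruptedException {
        while (!queue.offer(item, 100, TimeUnit.MILLISECONDS)) {
            if (writer.isDone()) {
                return;
            }
        }
    }
}
```

Calling DbToFileExport.export(iterator, path) blocks until all rows are on disk. Whether this is faster than a plain read-then-write loop depends on how much the database read and the file write can actually overlap, so it is worth measuring against a single-threaded version first.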

This is more of a comment:

For your first assumption: you should post the database part of the question on https://dba.stackexchange.com/ .

A simple search returned https://dba.stackexchange.com/questions/2918/about-single-threaded-versus-multithreaded-databases-performance , so you need to check whether your read operation is complex enough, and whether multithreading serves your needs for the database connection at all.

Also, your program seems to be a sequential read followed by a write. I don't think you need multithreading at all unless you want multiple writes to the same file at the same time.

You should have a look at Spring Batch, http://projects.spring.io/spring-batch/ , which relates to the JSR 352 specification.

This framework comes with good patterns for managing ETL-related operations, including multi-threaded processing, data partitioning, etc.

