Java Executor Service for parallel processing

I am working on a system that supports multiple database queries in parallel. Since there is a lot of data to query from each database, the requirement is to keep each database's queries isolated from the others; that is, load on one database/table should not affect queries on other tables. I developed a solution in Java using ExecutorService, with one ExecutorService (a fixed thread pool with 1 thread) per database. I maintain a map from DB name to ExecutorService and, on receiving a query request, direct the call to the respective executor service. Considering that a hundred databases may be queried in parallel, I am not sure whether ExecutorService is the right choice. I have done some evaluation and the initial results look okay. One challenge with this solution is that, because I create the ExecutorServices dynamically, it is getting tough to shut them down gracefully when the application stops.

Another way to tackle this problem is to maintain a global pool of query worker threads (that is, shared across all databases) and reuse them at random for incoming requests. But this will not guarantee that all database queries are given equal priority.

DatasetFactory.java

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DataSetExecutorFactory {

    // One DataSetExecutor per database name.
    private static final Map<String, DataSetExecutor> executorMap = new ConcurrentHashMap<>();

    public static DataSetExecutor getDataSetExecutor(String dbName) {
        // computeIfAbsent makes lookup-or-create atomic. A separate get()/put() pair,
        // even on a synchronized map, can race and create two executors for one database.
        return executorMap.computeIfAbsent(dbName, DataSetExecutor::new);
    }
}

DataSetExecutor.java

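import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class DataSetExecutor {

    private final String dbName;
    // Single worker thread, so queries against this database run one at a time.
    private final ExecutorService executor = Executors.newFixedThreadPool(1);

    // Constructor matching the new DataSetExecutor(dbName) call in the factory.
    public DataSetExecutor(String dbName) {
        this.dbName = dbName;
    }

    public List<Map<String, Object>> execQuery(String collecName, Map<String, Object> queryParams) {
        // Construct the query job.
        // QueryWorker implements Callable<QueryResult> and does the actual query to the DB.
        QueryWorker queryWorker = new QueryWorker(queryParams);

        Future<QueryResult> result = null;
        try {
            result = executor.submit(queryWorker);
        } catch (Exception e) {
            // Catch exception here
            e.printStackTrace();
        }
        // Waiting on the Future and converting QueryResult to the return type
        // is not shown in the original post.
    }
}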

I think you're misunderstanding how ExecutorService works. Rather than creating an ExecutorService for each database, you should create a single ExecutorService as a fixed thread pool of size n (n = number of databases or maximum number of parallel queries). The thread pool will do the parallel processing work for you. You simply need to track the database name as part of the QueryWorker that is submitted to the ExecutorService.
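As a rough illustration of the single-pool idea, here is a minimal sketch. The names NamedQueryWorker and SharedQueryPool are hypothetical (they are not from the question), and the call() body is a placeholder that returns a string instead of querying a real database.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical worker: the database name travels with the job.
final class NamedQueryWorker implements Callable<String> {
    private final String dbName;
    private final Map<String, Object> queryParams;

    NamedQueryWorker(String dbName, Map<String, Object> queryParams) {
        this.dbName = dbName;
        this.queryParams = queryParams;
    }

    @Override
    public String call() {
        // Placeholder for the real database call (assumption for illustration).
        return dbName + ":" + queryParams.size();
    }
}

// Hypothetical wrapper around one shared fixed-size pool.
final class SharedQueryPool {
    private final ExecutorService pool;

    SharedQueryPool(int maxParallelQueries) {
        this.pool = Executors.newFixedThreadPool(maxParallelQueries);
    }

    // Queue a query; any free worker thread in the shared pool picks it up.
    Future<String> submit(String dbName, Map<String, Object> queryParams) {
        return pool.submit(new NamedQueryWorker(dbName, queryParams));
    }

    void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) throws Exception {
        SharedQueryPool queries = new SharedQueryPool(4);
        try {
            Future<String> orders = queries.submit("orders", Map.of("id", 1));
            Future<String> users = queries.submit("users", Map.of("name", "a", "age", 3));
            System.out.println(orders.get(5, TimeUnit.SECONDS));
            System.out.println(users.get(5, TimeUnit.SECONDS));
        } finally {
            queries.shutdown();
        }
    }
}
```

Note that a shared pool does not by itself isolate databases from one another: several slow queries against one database can occupy all the worker threads, so whether this fits depends on how strict the isolation requirement is.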

This also makes shutdown easy: the thread pool manages its own worker threads, and you only need to shut it down once when the application closes.
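A common way to do that one-time shutdown is the two-phase pattern: stop accepting work, wait, then cancel what is left. The sketch below assumes a hypothetical helper class PoolShutdown; the timeout values are arbitrary.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the original post.
final class PoolShutdown {

    // Returns true if the pool terminated within the given timeouts.
    static boolean shutdownGracefully(ExecutorService pool, long timeout, TimeUnit unit) {
        pool.shutdown(); // reject new tasks; already submitted tasks still run
        try {
            if (pool.awaitTermination(timeout, unit)) {
                return true;
            }
            pool.shutdownNow(); // interrupt running tasks, drop queued ones
            return pool.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        pool.execute(() -> System.out.println("query done"));
        boolean clean = shutdownGracefully(pool, 10, TimeUnit.SECONDS);
        System.out.println("terminated cleanly: " + clean);
    }
}
```

If you do keep one executor per database, the same helper could be called on every executor held in the map when the application stops.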

All that being said, since all of this parallel processing happens in the same JVM and on the same machine, you might run into memory or CPU limitations depending on how intensive your queries are.
