Efficient Java collection to analyse the input from a CSV file with millions of records
Let's say I have a CSV file with stock exchange information in the following format: timestamp, name, price, qty, account, buy/sell. This file may have millions of records and represents the trading activity for the day. The file is not sorted, and I need to choose the most suitable Java collection for holding this data in order to provide analytics efficiently.
Example analytics: 1) most sold stock, 2) account with the most transactions, 3) highest quantity of a stock bought within a time range, 4) top K accounts with the highest number of transactions.
Basically, I will need to sort this list many times based on different fields.
After a little searching I found that a tree-based collection, like a TreeMap, seems best for this use case. Is there any other collection that would be better?
A TreeSet is efficient if you only need the data sorted by a single parameter. For multiple sort orders, you can define a class for the records:
public class Record {
    Calendar timeStamp;
    String name;
    double price;
    //...
}
Then create a comparator for each task, put the records into a LinkedList (or another List implementation), and sort it as needed:
List<Record> records = new LinkedList<>();
records.sort(yourComparator1);
records.sort(yourComparator2);
records.sort(yourComparator3);
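As a sketch of how this could look end to end, here is a minimal runnable example. The `Trade` class, its field names, and the sample values are illustrative assumptions (the question does not give a concrete schema), and `mostSoldStock` is one hypothetical way to implement analytic 1 with a single stream pass instead of a sort:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative record type mirroring the CSV columns (names are assumptions).
class Trade {
    long timestamp;
    String name;
    double price;
    long qty;
    String account;
    boolean buy;

    Trade(long timestamp, String name, double price, long qty, String account, boolean buy) {
        this.timestamp = timestamp;
        this.name = name;
        this.price = price;
        this.qty = qty;
        this.account = account;
        this.buy = buy;
    }
}

public class TradeAnalytics {

    // Analytic 1: most sold stock = sell trades grouped by name, quantities summed.
    static String mostSoldStock(List<Trade> trades) {
        Map<String, Long> soldByName = trades.stream()
                .filter(t -> !t.buy)
                .collect(Collectors.groupingBy(t -> t.name,
                        Collectors.summingLong(t -> t.qty)));
        return soldByName.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse(null);
    }

    public static void main(String[] args) {
        List<Trade> trades = new ArrayList<>();
        trades.add(new Trade(1, "AAPL", 150.0, 10, "acct1", true));
        trades.add(new Trade(2, "MSFT", 300.0, 5, "acct2", false));
        trades.add(new Trade(3, "AAPL", 151.0, 20, "acct1", false));

        // One comparator per sort order; the same list is re-sorted for each task.
        Comparator<Trade> byTime = Comparator.comparingLong(t -> t.timestamp);
        Comparator<Trade> byQtyDesc =
                Comparator.comparingLong((Trade t) -> t.qty).reversed();

        trades.sort(byQtyDesc);
        System.out.println(trades.get(0).name);    // stock with the largest single trade

        trades.sort(byTime);                       // restore chronological order
        System.out.println(mostSoldStock(trades)); // most sold stock by total quantity
    }
}
```

Note that each re-sort is O(n log n) over millions of records, so for aggregation-style analytics (most sold, top K accounts) a single pass building a `HashMap` of running totals, as in `mostSoldStock`, is usually cheaper than sorting the whole list.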