
Reading from .txt and storing into HashMap

Can I use any other approach to read the semicolon-separated strings from a .txt file into a HashMap, instead of using sourceArray?

 public static void main(String[] args) throws IOException {
    ArrayList<Synset> booleansynsets = null;
    ArrayList<Synset> booleanduplicatesynsets = null;
    Map<String, String> basebooleanentitieslist = new HashMap<String, String>();
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader bufferedReader = new BufferedReader(new FileReader("C:\\Users\\anand\\Desktop\\updatedDuplicateBooleanEntitiesList-sorted.txt"))) {
        String line;
        while ((line = bufferedReader.readLine()) != null) {
            // each line has the form key;value
            String[] sourceArray = line.split(";");
            basebooleanentitieslist.put(sourceArray[0], sourceArray[1]);
            System.out.println(line);
        }
    }
 }

// the updated one

        while ((line = bufferedReader.readLine()) != null) {
            // Tokenize each line as it is read; bufferedReader.toString()
            // returns the object's description, not the file contents.
            StringTokenizer st1 = new StringTokenizer(line, ";");
            if (st1.countTokens() >= 2) {
                basebooleanentitieslist.put(st1.nextToken(), st1.nextToken());
            }
            System.out.println(line);
        }

Consider using the StringTokenizer class.

You can use either StringTokenizer or split.

There is no problem with the current approach, but I felt it might be lengthy. I mean, is there a way to optimize it without using sourceArray?

You don't say what you are trying to optimize for: performance? memory usage? readability?

If you are concerned about performance, the next question is whether your concern is actually justified. Have you run your application? Is it too slow? Have you profiled it and determined that splitting the lines is taking a significant amount of time?

What specifically is wrong with using an array? (Yes, I know that allocating an array costs something, but have you any evidence that this is significant?)
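One rough way to gather that evidence is to time both approaches on a representative line. The sketch below is only a crude check, assuming a made-up sample line and iteration count; the class and method names are hypothetical. Numbers from a loop like this are sensitive to JIT warm-up, so a proper benchmark harness such as JMH, or a profiler run against the real application, would be more trustworthy.

```java
import java.util.StringTokenizer;

class SplitTiming {
    // Counts the fields produced by String.split over many iterations.
    static int countWithSplit(String line, int iterations) {
        int total = 0;
        for (int i = 0; i < iterations; i++) {
            total += line.split(";").length;
        }
        return total;
    }

    // Counts the fields produced by StringTokenizer over many iterations.
    static int countWithTokenizer(String line, int iterations) {
        int total = 0;
        for (int i = 0; i < iterations; i++) {
            total += new StringTokenizer(line, ";").countTokens();
        }
        return total;
    }

    public static void main(String[] args) {
        String line = "someKey;someValue"; // hypothetical sample line
        int iterations = 1_000_000;        // arbitrary choice for illustration

        // Warm-up passes so the JIT compiler has a chance to optimize both paths.
        countWithSplit(line, iterations);
        countWithTokenizer(line, iterations);

        long start = System.nanoTime();
        int splitFields = countWithSplit(line, iterations);
        long splitNanos = System.nanoTime() - start;

        start = System.nanoTime();
        int tokenizerFields = countWithTokenizer(line, iterations);
        long tokenizerNanos = System.nanoTime() - start;

        // Printing the field counts keeps the work from being optimized away.
        System.out.println("split:     " + splitNanos / 1_000_000 + " ms, fields=" + splitFields);
        System.out.println("tokenizer: " + tokenizerNanos / 1_000_000 + " ms, fields=" + tokenizerFields);
    }
}
```

If the difference is small compared with the time spent reading the file, the array allocation is unlikely to matter.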


If you are trying to optimize for readability, then I'd say that using String.split is probably more readable for this example. (Many Java programmers have never come across / used the StringTokenizer class.)
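For comparison, here is a minimal, self-contained sketch of the String.split approach. It is not the asker's exact program: the class name, the in-memory sample data, and the choice to skip malformed lines are all assumptions made for illustration. A StringReader stands in for the real file so the example runs anywhere.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

class SplitExample {
    // Reads "key;value" lines into a map; lines without a semicolon are skipped.
    static Map<String, String> parse(BufferedReader reader) throws IOException {
        Map<String, String> result = new HashMap<>();
        String line;
        while ((line = reader.readLine()) != null) {
            // Limit 2: split only at the first ';' so the value may itself contain semicolons.
            String[] parts = line.split(";", 2);
            if (parts.length == 2) {
                result.put(parts[0], parts[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample content standing in for the real file.
        String sample = "true;yes\nfalse;no\nmalformed line\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            System.out.println(parse(reader));
        }
    }
}
```

The length check avoids the ArrayIndexOutOfBoundsException that sourceArray[1] would throw on a line with no semicolon.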

If you are trying to optimize for performance / memory usage, then StringTokenizer is worth trying, but I wouldn't guarantee it is faster. Another alternative is to use Pattern and Matcher directly as follows:

    Pattern pattern = Pattern.compile("([^;]*);(.*)");
    while ((line = bufferedReader.readLine()) != null) {
        Matcher matcher = pattern.matcher(line);
        if (matcher.matches()) {
            basebooleanentitieslist.put(matcher.group(1), matcher.group(2));
        }
    }

(By the way, the code above gracefully handles the case where a line doesn't split, i.e. without throwing an exception. If you want to deal with that case explicitly, add an else clause.)
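As a sketch of what that else clause might look like, the version below collects non-matching lines in a separate list instead of silently dropping them. The class name, the rejected list, and the sample input are hypothetical; what to do with bad lines (log, count, abort) depends on the application.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternExample {
    // Same pattern as above: everything before the first ';' is the key, the rest is the value.
    private static final Pattern KEY_VALUE = Pattern.compile("([^;]*);(.*)");

    // Parses the lines into a map and appends any line without a ';' to 'rejected'.
    static Map<String, String> parse(List<String> lines, List<String> rejected) {
        Map<String, String> result = new HashMap<>();
        for (String line : lines) {
            Matcher matcher = KEY_VALUE.matcher(line);
            if (matcher.matches()) {
                result.put(matcher.group(1), matcher.group(2));
            } else {
                // Explicit handling of a line that does not split.
                rejected.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> rejected = new ArrayList<>();
        // Hypothetical sample lines standing in for the file contents.
        Map<String, String> map = parse(Arrays.asList("true;yes", "no separator here"), rejected);
        System.out.println("parsed:   " + map);
        System.out.println("rejected: " + rejected);
    }
}
```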
