
Reading from .txt and storing into Hashmap

Is there another approach to reading semicolon-separated strings from a .txt file into a HashMap that does not use the sourceArray array?

    public static void main(String[] args) throws IOException {
        ArrayList<Synset> booleansynsets = null;
        ArrayList<Synset> booleanduplicatesynsets = null;
        Map<String, String> basebooleanentitieslist = new HashMap<String, String>();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader bufferedReader = new BufferedReader(new FileReader("C:\\Users\\anand\\Desktop\\updatedDuplicateBooleanEntitiesList-sorted.txt"))) {
            String line;
            while ((line = bufferedReader.readLine()) != null) {
                // Each line is expected to look like "key;value"
                String[] sourceArray = line.split(";");
                basebooleanentitieslist.put(sourceArray[0], sourceArray[1]);
                System.out.println(line);
            }
        }
    }

// the updated one (replaces the while loop above)

            while ((line = bufferedReader.readLine()) != null) {
                // Tokenize the current line; bufferedReader.toString() does not return the file contents
                StringTokenizer st1 = new StringTokenizer(line, ";");
                if (st1.countTokens() >= 2) {
                    basebooleanentitieslist.put(st1.nextToken(), st1.nextToken());
                }
                System.out.println(line);
            }

Consider using the StringTokenizer class.

You can use either StringTokenizer or split.
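
For illustration, here is a minimal sketch of the StringTokenizer variant. It assumes each line has the form key;value; the class name TokenizerExample and the helper readPairs are made up for this example, and a StringReader stands in for the real file. Note that StringTokenizer skips empty tokens, so a line such as ";value" yields only one token and is ignored here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

class TokenizerExample {
    // Hypothetical helper: reads "key;value" lines into a map without an intermediate array.
    static Map<String, String> readPairs(BufferedReader reader) throws IOException {
        Map<String, String> pairs = new HashMap<>();
        String line;
        while ((line = reader.readLine()) != null) {
            StringTokenizer tokens = new StringTokenizer(line, ";");
            // Skip lines that do not contain at least a key and a value.
            if (tokens.countTokens() >= 2) {
                pairs.put(tokens.nextToken(), tokens.nextToken());
            }
        }
        return pairs;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the FileReader used in the question.
        BufferedReader reader = new BufferedReader(new StringReader("a;1\nb;2\nmalformed\n"));
        System.out.println(readPairs(reader));
    }
}
```

To read from the real file, wrap a FileReader in the BufferedReader instead of the StringReader.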

There is no problem with the current approach, but it feels lengthy. Is there a way to optimize it without using sourceArray?

You don't say what you are trying to optimize for: performance? memory usage? readability?

If you are concerned about performance, the next question is whether your concern is actually justified. Have you run your application? Is it too slow? Have you profiled it and determined that splitting the lines is taking a significant amount of time?

What specifically is wrong with using an array? (Yes, I know that allocating an array costs something, but have you any evidence that this is significant?)
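
One rough way to gather that evidence is to time both approaches on data shaped like yours. The sketch below is only a crude measurement under stated assumptions: the input is synthetic key;value lines, the class and method names (SplitTiming, countWithSplit, countWithTokenizer) are invented for illustration, and a single run without warm-up is easily distorted by JIT compilation. A proper benchmark harness would give more trustworthy numbers.

```java
import java.util.StringTokenizer;

class SplitTiming {
    // Counts fields using String.split, which allocates an array per line.
    static int countWithSplit(String[] lines) {
        int fields = 0;
        for (String line : lines) {
            fields += line.split(";").length;
        }
        return fields;
    }

    // Counts fields using StringTokenizer, which avoids the array.
    static int countWithTokenizer(String[] lines) {
        int fields = 0;
        for (String line : lines) {
            fields += new StringTokenizer(line, ";").countTokens();
        }
        return fields;
    }

    public static void main(String[] args) {
        // Synthetic input: an assumption, not the data from the question.
        String[] lines = new String[100_000];
        for (int i = 0; i < lines.length; i++) {
            lines[i] = "key" + i + ";value" + i;
        }

        long t0 = System.nanoTime();
        int splitFields = countWithSplit(lines);
        long t1 = System.nanoTime();
        int tokenizerFields = countWithTokenizer(lines);
        long t2 = System.nanoTime();

        // Printing the counts keeps the JIT from discarding the work.
        System.out.printf("split: %d fields in %d ms%n", splitFields, (t1 - t0) / 1_000_000);
        System.out.printf("StringTokenizer: %d fields in %d ms%n", tokenizerFields, (t2 - t1) / 1_000_000);
    }
}
```

If the two timings are both negligible next to the cost of reading the file, the choice between them does not matter for performance.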


If you are trying to optimize for readability, then I'd say that using String.split is probably more readable for this example. (Many Java programmers have never come across / used the StringTokenizer class.)

If you are trying to optimize for performance / memory usage, then StringTokenizer is worth trying, but I wouldn't guarantee it is faster. Another alternative is to use Pattern and Matcher directly as follows:

    Pattern pattern = Pattern.compile("([^;]*);(.*)");
    while ((line = bufferedReader.readLine()) != null) {
        Matcher matcher = pattern.matcher(line);
        if (matcher.matches()) {
            basebooleanentitieslist.put(matcher.group(1), matcher.group(2));
        }
    }

(By the way, the code above copes with a line that doesn't split, and does so gracefully, i.e. without throwing an exception. If you want to deal with that case explicitly, add an else clause.)
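
An else clause of that kind might look like the following sketch. It uses the same pattern as above, so everything after the first semicolon becomes the value. Collecting the non-matching lines in a list is just one possible policy, and the names PatternExample, readPairs, and rejected are assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternExample {
    // Compiled once and reused for every line.
    private static final Pattern PAIR = Pattern.compile("([^;]*);(.*)");

    // Hypothetical helper: puts matching lines in the map and records the rest in 'rejected'.
    static Map<String, String> readPairs(BufferedReader reader, List<String> rejected) throws IOException {
        Map<String, String> pairs = new HashMap<>();
        String line;
        while ((line = reader.readLine()) != null) {
            Matcher matcher = PAIR.matcher(line);
            if (matcher.matches()) {
                pairs.put(matcher.group(1), matcher.group(2));
            } else {
                // No semicolon on this line: keep it for later reporting.
                rejected.add(line);
            }
        }
        return pairs;
    }

    public static void main(String[] args) throws IOException {
        List<String> rejected = new ArrayList<>();
        BufferedReader reader = new BufferedReader(new StringReader("a;1\nnosep\nb;2;3\n"));
        Map<String, String> pairs = readPairs(reader, rejected);
        System.out.println(pairs + " rejected=" + rejected);
    }
}
```

Depending on the application, the else branch could instead log a warning or throw an exception that names the offending line.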

