
How to compare each line of a text file in Java?

I have a text file with 792 lines. Its content looks like this:

der 17788648
und 14355959
die 10939606
Die 10480597

Now I want to check whether "Die" and "die" are equal once lowercased. If two strings are equal in lowercase, the word should be copied to a new text file in lowercase, with the values summed.

Expected output:

der 17788648
und 14355959
die 114420203

This is what I have so far:

    try {
        BufferedReader bk = null;
        BufferedWriter bw = null;

        bk = new BufferedReader(new FileReader("outagain.txt"));
        bw = new BufferedWriter(new FileWriter("outagain5.txt"));

        List<String> list = new ArrayList<>();
        String s = "";
        while (s != null) {
            s = bk.readLine();
            list.add(s);
        }


        for (int k = 0; k < 793; k++) {
            String u = bk.readLine();
            if (list.contains(u.toLowerCase())) {

                //sum values?

            } else {
                bw.write(u + "\n");
            }
        }

        System.out.println(list.size());

    } catch (Exception e) {
        System.out.println("Exception caught : " + e);
    }

Replace list.add(s); with list.add(s.toLowerCase()); in the reading loop. Right now your code is comparing lines of indeterminate case to lower-cased lines.
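A minimal sketch of that change, assuming the lines are read with a BufferedReader as in the question. The names LowerCaseLines and readLowerCased are made up for illustration, and a StringReader stands in for the real file. The loop also tests for null before adding, so the end-of-input marker never ends up in the list.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the question's code.
class LowerCaseLines {

    // Reads every line and stores it lower-cased, so that later
    // contains() checks compare lower-case against lower-case.
    static List<String> readLowerCased(BufferedReader reader) throws IOException {
        List<String> list = new ArrayList<>();
        String s;
        // readLine() returns null at end of input; test before adding.
        while ((s = reader.readLine()) != null) {
            list.add(s.toLowerCase());
        }
        return list;
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample standing in for outagain.txt.
        BufferedReader reader = new BufferedReader(
                new StringReader("die 10939606\nDie 10480597\n"));
        List<String> list = readLowerCased(reader);
        System.out.println(list.contains("Die 10480597".toLowerCase()));
    }
}
```

Note that contains() still compares whole lines, number included, so this only removes the case mismatch. Summing the values needs the word and the number to be separated first.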

With Java 8, the best approach to standard problems like reading files, comparing, grouping, and collecting is to use the Streams API, since the code is much more concise that way. At least when the file is only a few KB, there will be no problems with that. Something like:

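// Needs java.nio.file.Files, java.nio.file.Paths, java.util.Map, java.util.stream.Collectors.
// Files.lines() throws IOException, so call it from a try block or a method that declares it.
Map<String, Integer> nameSumMap = Files.lines(Paths.get("test.txt"))
            .map(x -> x.split(" "))
            .collect(Collectors.groupingBy(x -> x[0].toLowerCase(),
                    Collectors.summingInt(x -> Integer.parseInt(x[1]))
            ));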

First, you read the file with Files.lines(), which returns a Stream<String>. Then you split each string to get a Stream<String[]>. Finally, you use groupingBy() and summingInt() to group by the first element of the array and sum the second one.

If you don't want to use the Streams API, you can also create a HashMap and do the summing manually in a loop.
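The manual loop could look roughly like the sketch below. It is an illustration under assumptions: the class and method names (ManualSum, sumByWord) are invented here, a LinkedHashMap is swapped in for the plain HashMap so that words keep their first-seen order, and the lines are passed in as a list instead of being read from a file.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name, for illustration only.
class ManualSum {

    // Sums the numeric column per lower-cased word, keeping first-seen order.
    static Map<String, Long> sumByWord(List<String> lines) {
        Map<String, Long> sums = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length < 2) {
                continue; // skip blank or malformed lines
            }
            // merge() stores the value if the key is new,
            // otherwise it adds the value to the existing total.
            sums.merge(parts[0].toLowerCase(), Long.parseLong(parts[1]), Long::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "der 17788648", "und 14355959", "die 10939606", "Die 10480597");
        sumByWord(lines).forEach((word, total) -> System.out.println(word + " " + total));
    }
}
```

To run it against a real file, Files.readAllLines(Paths.get(...)) returns a List<String> that can be passed straight to sumByWord.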

The String class has an equalsIgnoreCase method, which you can use to compare two strings irrespective of case. So:

String var1 = "Die";
String var2 = "die";

System.out.println(var1.equalsIgnoreCase(var2));

This prints true.

If I understood your question correctly, you want to know how to get the prefix from each line of the file, compare it, get the value behind it, and sum the values for each prefix. Is that about right?

You could use regular expressions to get the prefixes and values separately. Then you can sum up all values with the same prefix and write one total per prefix to the file.
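As a rough sketch of the regex idea: the pattern below assumes each line is one word, some whitespace, and a run of digits, which matches the sample lines in the question but may not cover the whole file. RegexSum and sumByPrefix are illustrative names, and the output is printed here instead of written to a file.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, for illustration only.
class RegexSum {

    // Group 1: the prefix (a run of non-whitespace); group 2: the number.
    // The line format is an assumption based on the sample data.
    private static final Pattern LINE = Pattern.compile("(\\S+)\\s+(\\d+)");

    static Map<String, Long> sumByPrefix(List<String> lines) {
        Map<String, Long> sums = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line.trim());
            if (!m.matches()) {
                continue; // ignore lines that do not fit the assumed format
            }
            String prefix = m.group(1).toLowerCase();
            long value = Long.parseLong(m.group(2));
            // Add the value to whatever total the prefix already has.
            sums.put(prefix, sums.getOrDefault(prefix, 0L) + value);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "der 17788648", "und 14355959", "die 10939606", "Die 10480597");
        for (Map.Entry<String, Long> e : sumByPrefix(lines).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```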

If you are not familiar with regular expressions, these links could help you:

Regex on tutorialpoint.com

Regex on vogella.com

For additional tutorials, just search Google for "java regex" or similar terms.

If you do not want to distinguish between upper- and lowercase strings, just convert them all to lowercase (or uppercase) before comparing them, as @spork already explained.

Use a HashMap to keep track of the unique keys. Before you do a put, do a get to see if the key is already there. If it is, sum the old value with the new one and put it in again (this replaces the old entry with the same key).

package com.foundations.framework.concurrency;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class FileSummarizer {

  public static void main(String[] args) {
    // Lower-cased word -> running total of its values
    Map<String, Long> rows = new HashMap<>();
    // try-with-resources closes the reader, and avoids calling close() on a
    // null reader when the file cannot be opened
    try (BufferedReader reader = new BufferedReader(new FileReader("data.txt"))) {
      String line;
      while ((line = reader.readLine()) != null) {
        String[] tokens = line.split(" ");
        String key = tokens[0].toLowerCase();
        long current = Long.parseLong(tokens[1]);

        // If the key was seen before, add its previous total
        Long previous = rows.get(key);
        if (previous != null) {
          current += previous;
        }
        rows.put(key, current);
      }
    }
    catch (IOException e) {
      e.printStackTrace();
    }

    // Print one "word total" line per unique key
    for (Map.Entry<String, Long> entry : rows.entrySet()) {
      System.out.println(entry.getKey() + " " + entry.getValue());
    }
  }
}

