
Find the line number of a text file by each word

I want to find, for each word in a text file, the numbers of the lines it appears on. However, the method I wrote below only gives the first line number, while I need a list of line numbers.

For instance, if "a" occurs in lines 1, 3, and 5, it should have the list [1,3,5]. This list will then be passed to another method for further processing. But my result only shows [1] for "a".

Can someone help me fix this? Thank you!

    public SomeObject<Word> buildIndex(String fileName, Comparator<Word> comparator) {
        SomeObject<Word> someObject = new SomeObject<>(comparator);

        Comparator<Word> comp = checkComparator(someObject.comparator());
        int num = 0;
        if (fileName != null) {
            File file = new File(fileName);
            try (Scanner scanner = new Scanner(file, "latin1")) {
                while (scanner.hasNextLine()) {
                    String lines;
                    if (comparator instanceof IgnoreCase) {
                        lines = scanner.nextLine().toLowerCase();
                    } else {
                        lines = scanner.nextLine();
                    }
                    if (lines != null) {
                        String[] lineFromText = lines.split("\n");

                        List<Integer> list = new ArrayList<>();
                        for (int i = 0; i < lineFromText.length; i++) {
                            String[] wordsFromText = lineFromText[i].split("\\W");
                            num++;

                            for (String s : wordsFromText) {

                                if (s != null && lineFromText[i].contains(s)) {
                                    list.add(num);
                                }

                                if (s != null && !s.trim().isEmpty() && s.matches("^[a-zA-Z]*$")) {
                                    doInsert(s, comp, someObject, list);
                                }
                            }


                        }

                    }
                }
            } catch (FileNotFoundException e) {
                e.printStackTrace();
            }
        }
        return someObject;
    }

Does something like this work for you?

  1. It reads in the lines one at a time.
  2. It finds the words by splitting on whitespace.
  3. It then puts the words and the line numbers in a map, where the key is the word and the value is a list of line numbers.
      // needs java.io.File and java.util.*; new Scanner(File) throws FileNotFoundException
      int lineCount = 1;
      String fileName = "SomeFileName";
      Map<String, List<Integer>> index = new HashMap<>();

      // open the file itself; new Scanner("fileName") would scan the literal text
      try (Scanner scanner = new Scanner(new File(fileName))) {
         while (scanner.hasNextLine()) {
            // get single line from file
            String line = scanner.nextLine().toLowerCase();
            // split into words
            for (String word : line.split("\\s+")) {
                // blank lines and leading whitespace produce an empty token
                if (word.isEmpty()) {
                    continue;
                }
                // add lineCount to the word's List,
                // creating the List first if it is not there yet
                index.computeIfAbsent(word, wd -> new ArrayList<>()).add(lineCount);
            }
            // bump lineCount for next line
            lineCount++;
         }
      }

Print them out.

      index.forEach((k, v) -> System.out.println(k + " --> " + v));
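
For a version that can be run without a file on disk, the same idea can be sketched as a small class. This is only an illustration under stated assumptions: the class name `WordLineIndex`, the method signature, and the `ignoreCase` flag are hypothetical and are not taken from the question's `SomeObject` or `doInsert` code. It splits on `\\W+` and keeps only alphabetic tokens, as the question's filter appears to intend, and it records each line number at most once per word.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the original question or answer.
class WordLineIndex {

    // Maps each word to the 1-based numbers of the lines that contain it.
    static Map<String, List<Integer>> buildIndex(List<String> lines, boolean ignoreCase) {
        // TreeMap keeps the words in sorted order when printing.
        Map<String, List<Integer>> index = new TreeMap<>();
        int lineNumber = 0;
        for (String line : lines) {
            lineNumber++;
            String text = ignoreCase ? line.toLowerCase() : line;
            // Split on runs of non-word characters.
            for (String word : text.split("\\W+")) {
                // Keep purely alphabetic tokens only; this also skips empty tokens.
                if (!word.matches("[a-zA-Z]+")) {
                    continue;
                }
                // One shared list per word, created on first sight of the word.
                List<Integer> numbers = index.computeIfAbsent(word, w -> new ArrayList<>());
                // Record each line once, even if the word repeats on that line.
                if (numbers.isEmpty() || numbers.get(numbers.size() - 1) != lineNumber) {
                    numbers.add(lineNumber);
                }
            }
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a b", "c", "A a", "", "b a");
        buildIndex(lines, true).forEach((k, v) -> System.out.println(k + " --> " + v));
    }
}
```

To index a real file, the lines could be loaded first, for example with `Files.readAllLines(Paths.get(fileName))`, and then passed to `buildIndex`. The key design point is that the list lives in the map and is reused for every occurrence of a word, rather than being created anew for each line.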


