Tree/Hash map with LList printout

My program takes in a text file and stores every unique word (or grouping of characters) as the key in a map, along with a linked list of the line numbers that each word appears on. I also implemented an occurrence counter in the printEntry method.

My problem is that I am trying to avoid printing the same line number twice when a word appears more than once on a line. I have fooled around with the if statement in the printEntry method and seem to be getting close, but still no cigar. I do NOT want to block the duplicate line number from being added to the list, because it still needs to be counted to increment the occurance variable.

Here is an input that would cause me trouble:

keyboard
mouse mouse
mouse

I need the output to look like this:

ID: keyboard  Line Numbers: 1  Occurance: 1
ID: mouse  Line Numbers: 2,3  Occurance: 3

I will only provide the printEntry method for now to keep the post short. If needed, I can provide further code. Thanks.

public static void printEntry(Map.Entry entry){

    //local occurance variable
    int occurance = 1;

    //print the word and the line numbers as well as test for duplicate line integers on the same key
    Iterator itr = ((LinkedList) entry.getValue()).iterator();
    System.out.print("ID: " + entry.getKey() + "   Lines: " + itr.next());

    //object variable to store previous line number
    Object check = itr.next();
    while(itr.hasNext()){
        occurance++;
        if (check != itr.next()){
            System.out.print(", " + itr.next());
        }
        else {
            System.out.println("Skipped duplicate");
        }
    }
    //prints occurance from incremented occurance variable
    System.out.print("  " + " Occurance: " + occurance);
    System.out.println();
}

Edit:

I would like all of an entry's information to appear on the same line, as we are going to be scanning large(r) documents. I have formatted the printEntry method close to where I would like it, but cannot figure out how to do it with the for loop.

public void printEntry(Map.Entry<String, WordStats> entry) {
    String word = entry.getKey();
    WordStats stats = entry.getValue();

    System.out.print("ID: " + word + "  Occurrences: " 
                       + stats.getOccurrences() + " Lines: ");
    for (Integer lineNumber : stats.getLines()) {
        System.out.println(lineNumber);
    }
}

So you want, for each word, to keep:

  • the number of times it appears
  • a sorted set of line numbers where it appears (and by set, I mean no duplicate line numbers)

So just do that:

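import java.util.SortedSet;
import java.util.TreeSet;

public class WordStats {
    private int occurrences;
    private final SortedSet<Integer> lineNumbers = new TreeSet<Integer>();

    // Counts every appearance; the set ignores a line number it already holds
    public void addOccurrence(int lineNumber) {
        occurrences++;
        lineNumbers.add(lineNumber);
    }

    public int getOccurrences() {
        return occurrences;
    }

    public SortedSet<Integer> getLines() {
        return lineNumbers;
    }
}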

And now just use a Map<String, WordStats>. For each word in the text, add a WordStats to the map if the word isn't there yet, then add an occurrence to its WordStats instance.
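The map-filling step described above could be sketched roughly as follows. This is only an illustration, not the asker's actual reading code: the class name WordIndexSketch, the helper buildIndex, and the idea of passing the lines in as a String array are all assumptions, and computeIfAbsent requires Java 8 or later.

```java
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

class WordIndexSketch {

    // Same shape as the WordStats class above, repeated so the sketch compiles on its own.
    static class WordStats {
        private int occurrences;
        private final SortedSet<Integer> lineNumbers = new TreeSet<>();

        void addOccurrence(int lineNumber) {
            occurrences++;               // every appearance is counted
            lineNumbers.add(lineNumber); // a repeated line number is ignored by the set
        }

        int getOccurrences() { return occurrences; }

        SortedSet<Integer> getLines() { return lineNumbers; }
    }

    // Hypothetical helper: builds the word index from lines already read into memory.
    static Map<String, WordStats> buildIndex(String[] lines) {
        Map<String, WordStats> index = new TreeMap<>();
        for (int i = 0; i < lines.length; i++) {
            for (String word : lines[i].trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // a blank line splits into a single empty token
                }
                // Create the WordStats on first sight, then record this appearance (1-based line).
                index.computeIfAbsent(word, k -> new WordStats()).addOccurrence(i + 1);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        String[] lines = { "keyboard", "mouse mouse", "mouse" };
        WordStats mouse = buildIndex(lines).get("mouse");
        // prints: 3 [2, 3]
        System.out.println(mouse.getOccurrences() + " " + mouse.getLines());
    }
}
```

With the sample input from the question, "mouse" is counted three times but line 2 is stored only once, which is the behaviour asked for.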

The printEntry method would then look like:

public void printEntry(Map.Entry<String, WordStats> entry) {
    String word = entry.getKey();
    WordStats stats = entry.getValue();
    System.out.println("The word " + word + " has been met " 
                       + stats.getOccurrences() + " time(s), on the following line(s):");
    for (Integer lineNumber : stats.getLines()) {
        System.out.println(lineNumber);
    }
}
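If everything should appear on one line, as the edit to the question asks, one possible variant is to join the line numbers into a single string before printing. The sketch below assumes Java 8 (for StringJoiner); the class name PrintEntrySketch and the helper formatEntry are made up for illustration, and the label text is just one guess at the desired format.

```java
import java.util.AbstractMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.StringJoiner;
import java.util.TreeSet;

class PrintEntrySketch {

    // Same shape as the WordStats class above, repeated so the sketch compiles on its own.
    static class WordStats {
        private int occurrences;
        private final SortedSet<Integer> lineNumbers = new TreeSet<>();

        void addOccurrence(int lineNumber) {
            occurrences++;
            lineNumbers.add(lineNumber);
        }

        int getOccurrences() { return occurrences; }

        SortedSet<Integer> getLines() { return lineNumbers; }
    }

    // Hypothetical helper: builds the whole entry as one line of text.
    static String formatEntry(Map.Entry<String, WordStats> entry) {
        WordStats stats = entry.getValue();
        StringJoiner lines = new StringJoiner(","); // puts the separator only between items
        for (Integer lineNumber : stats.getLines()) {
            lines.add(lineNumber.toString());
        }
        return "ID: " + entry.getKey()
                + "  Line Numbers: " + lines
                + "  Occurrences: " + stats.getOccurrences();
    }

    static void printEntry(Map.Entry<String, WordStats> entry) {
        System.out.println(formatEntry(entry)); // a single println per entry
    }

    public static void main(String[] args) {
        WordStats mouse = new WordStats();
        mouse.addOccurrence(2);
        mouse.addOccurrence(2);
        mouse.addOccurrence(3);
        // prints: ID: mouse  Line Numbers: 2,3  Occurrences: 3
        printEntry(new AbstractMap.SimpleEntry<>("mouse", mouse));
    }
}
```

Building the string first and printing it once avoids the problem in the edited question, where println inside the loop puts each line number on its own line.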
