Find the list of unique words contained in 2 text files in Java using arrays

I need to read two text files and display all the unique words across both files (a word that appears in both files should be printed only once).

file1.txt

lion
tiger
cheetah
elephant
cow

file2.txt

mouse
dog
cow
cat
lion

Expected output:

lion tiger cheetah elephant cow dog cat mouse

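import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

public class Workshop {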

static int count1 = 0;
static int count2 = 0;

private static final String FILE1 = "C:\\Users\\shagi\\Desktop\\file1.txt";
private static final String FILE2 = "C:\\Users\\shagi\\Desktop\\file2.txt";

static String arrayLines1[] = new String[countLines(FILE1)];
static String arrayLines2[] = new String[countLines(FILE2)];
static String totalArray[] = new String[arrayLines1.length + arrayLines2.length];
static String arrayLines1new[]=new String[countLines(FILE1)];
static int flag = 0;
static int k=arrayLines1.length;

public static void main(String[] args) throws IOException {
    readFile(FILE1, FILE2);
    displaySimilar();
    displayAll();
}

public static int countLines(String File) {
    int lineCount = 0;
    try {
        BufferedReader br = new BufferedReader(new FileReader(File));
        while ((br.readLine()) != null) {
            lineCount++;
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return lineCount;
}

public static void readFile(String File1, String File2) {
    String contents1 = null;
    String contents2 = null;
    try {
        FileReader file1 = new FileReader(File1);
        FileReader file2 = new FileReader(File2);
        BufferedReader buf1 = new BufferedReader(file1);
        BufferedReader buf2 = new BufferedReader(file2);
        while ((contents1 = buf1.readLine()) != null) {
            arrayLines1[count1] = contents1;
            count1++;
        }
        while ((contents2 = buf2.readLine()) != null) {
            arrayLines2[count2] = contents2;
            count2++;
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}

I tried two methods to solve this. Method 1:

public static void displayAll() {
    for (int i =0; i<k-1;i++){
        System.out.println(totalArray[i]);
    }

    System.out.println(totalArray[k-1]);
    System.out.println("");
    int p=0;
    for (int i=0;i<arrayLines2.length;i++){
        for (int j=0;j<arrayLines1.length;j++){
            if (arrayLines2[i].equals(arrayLines1[j])){
                flag=1;
                break;
            } else {
                flag=0;
            }
            if (flag==1){
                arrayLines1new[p]=arrayLines2[i];
                p++;
            }
        }
    }
}

Method 2:

public static void displayAll() {
    for (int i = 0; i < arrayLines1.length; i++) {
        String a = arrayLines1[i];
        for (int j = 0; j < arrayLines2.length; j++) {
            String b = arrayLines2[j];
            if (!a.equals(b)) {
                System.out.println(a);
            }
        }
    }
}

But neither gives the expected output.

This would be a good situation for a HashMap. The keys would be the words and the values would be the number of occurrences. You could then print the keys with a value of 1 to get the words that occur only once, or print every key to get each distinct word exactly once, as in the expected output. The pseudocode would look like this:

  1. Initialize the map: HashMap<String, Integer> wordMap = new HashMap<>();
  2. For each file:
     -- For each word:
     ---- Put the word in wordMap with the appropriate value (1 if it is new, otherwise its old count plus 1).
  3. For each key in wordMap:
     -- If wordMap.get(key) == 1, print out the key.
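The steps above can be sketched in Java roughly as follows. This is a minimal sketch rather than the answerer's code: the class name WordCountSketch and the method countWords are hypothetical, the two in-memory lists stand in for the lines read from file1.txt and file2.txt, and a LinkedHashMap is swapped in for the plain HashMap only so that the keys print in first-seen order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name; the word lists stand in for the lines of the two files.
class WordCountSketch {

    // Counts how many times each word occurs across all the given lists.
    static Map<String, Integer> countWords(List<List<String>> files) {
        // Assumption: LinkedHashMap is used to keep first-seen order.
        // A plain HashMap works the same way if the order does not matter.
        Map<String, Integer> wordMap = new LinkedHashMap<>();
        for (List<String> file : files) {
            for (String word : file) {
                // Store 1 for a new word, otherwise add 1 to the existing count.
                wordMap.merge(word, 1, Integer::sum);
            }
        }
        return wordMap;
    }

    public static void main(String[] args) {
        List<String> file1 = Arrays.asList("lion", "tiger", "cheetah", "elephant", "cow");
        List<String> file2 = Arrays.asList("mouse", "dog", "cow", "cat", "lion");
        Map<String, Integer> wordMap = countWords(Arrays.asList(file1, file2));

        // Every distinct word, each printed once.
        System.out.println(String.join(" ", wordMap.keySet()));

        // Only the words that occur exactly once overall.
        for (Map.Entry<String, Integer> entry : wordMap.entrySet()) {
            if (entry.getValue() == 1) {
                System.out.println(entry.getKey());
            }
        }
    }
}
```

Printing the key set covers the "every word once" requirement from the question, while the count filter gives the words that appear in only one place.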

You could also accomplish the same thing using two ArrayLists, one to keep track of the words and another to keep track of the counts.
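A rough sketch of the two-list variant is shown here, assuming the lists are kept in parallel so that the count at index i belongs to the word at index i. The names ParallelListsSketch and tally are hypothetical, and the input lists again stand in for the file contents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class name; words.get(i) has been seen counts.get(i) times.
class ParallelListsSketch {

    // Adds every word of input to the parallel lists, updating counts as it goes.
    static void tally(List<String> input, List<String> words, List<Integer> counts) {
        for (String word : input) {
            int index = words.indexOf(word); // linear search through the words seen so far
            if (index < 0) {
                words.add(word);   // first occurrence: new entry in both lists
                counts.add(1);
            } else {
                counts.set(index, counts.get(index) + 1);
            }
        }
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>();
        List<Integer> counts = new ArrayList<>();
        tally(Arrays.asList("lion", "tiger", "cheetah", "elephant", "cow"), words, counts);
        tally(Arrays.asList("mouse", "dog", "cow", "cat", "lion"), words, counts);

        // Print each distinct word together with its count.
        for (int i = 0; i < words.size(); i++) {
            System.out.println(words.get(i) + " " + counts.get(i));
        }
    }
}
```

Note that indexOf scans the list on every call, so this version does more work per word than the map-based one.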

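Both methods make a single pass over the N words, but using the map is more performant: a map lookup and update takes O(1) on average, so the whole pass is O(N), whereas the two-list version has to search the word list for every word, which can add up to O(N^2) in the worst case.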

There is a lot of redundant code. Here is a simpler and shorter version.

I am using Set and its operations to find the common words (intersection), the uncommon words, and all unique words.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

public class Workshop {

    private static final String FILE1 = "C:\\Users\\shagi\\Desktop\\file1.txt";
    private static final String FILE2 = "C:\\Users\\shagi\\Desktop\\file2.txt";

    static Set<String> file1Words = new HashSet<String>();
    static Set<String> file2Words = new HashSet<String>();
    static Set<String> allWords = new HashSet<String>();
    static Set<String> commonWords = new HashSet<String>();
    static Set<String> uncommonWords = new HashSet<String>();

    public static void main(String[] args) throws IOException {
        file1Words.addAll(readFile(FILE1));
        file2Words.addAll(readFile(FILE2));
        System.out.println("file1  : " + file1Words);
        System.out.println("file2  : " + file2Words);
        displaySimilar();
        System.out.println("common : " + commonWords);
        displayAll();
        System.out.println("all    : " + allWords);
        displayUnCommon();
        System.out.println("uncommon : " + uncommonWords);
    }

    public static void displaySimilar() {
        commonWords.addAll(file1Words);
        commonWords.retainAll(file2Words);
    }

    public static void displayUnCommon() {
        uncommonWords.addAll(file1Words);
        uncommonWords.addAll(file2Words);
        uncommonWords.removeAll(commonWords);
    }

    public static Set<String> readFile(String file) {
        Set<String> words = new HashSet<String>();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader buffer = new BufferedReader(new FileReader(file))) {
            String content;
            while ((content = buffer.readLine()) != null) {
                words.add(content);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return words;
    }

    public static void displayAll() {
        allWords.addAll(file1Words);
        allWords.addAll(file2Words);
    }
}

Sample run:

file1  : [lion, cheetah, tiger, elephant, cow]
file2  : [lion, mouse, cat, cow, dog]
common : [lion, cow]
all    : [cheetah, lion, cat, mouse, tiger, elephant, cow, dog]
uncommon : [cheetah, cat, mouse, tiger, elephant, dog]
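If the exercise really requires plain arrays, as the title suggests, the same "all unique words" result can be sketched without any collections. This is only an illustrative sketch: the names ArrayUnionSketch, contains and uniqueWords are hypothetical, and the two arrays stand in for the lines already read from file1.txt and file2.txt.

```java
import java.util.Arrays;

// Hypothetical class name; uses only arrays to build the list of distinct words.
class ArrayUnionSketch {

    // Returns true if word is among the first size entries of arr.
    static boolean contains(String[] arr, int size, String word) {
        for (int i = 0; i < size; i++) {
            if (arr[i].equals(word)) {
                return true;
            }
        }
        return false;
    }

    // Copies each word into the result only the first time it is seen.
    static String[] uniqueWords(String[] first, String[] second) {
        String[] total = new String[first.length + second.length];
        int size = 0;
        for (String word : first) {
            if (!contains(total, size, word)) {
                total[size++] = word;
            }
        }
        for (String word : second) {
            if (!contains(total, size, word)) {
                total[size++] = word;
            }
        }
        // Trim the unused tail so the result holds only the distinct words.
        return Arrays.copyOf(total, size);
    }

    public static void main(String[] args) {
        String[] file1 = {"lion", "tiger", "cheetah", "elephant", "cow"};
        String[] file2 = {"mouse", "dog", "cow", "cat", "lion"};
        System.out.println(String.join(" ", uniqueWords(file1, file2)));
    }
}
```

The key difference from the two attempts in the question is that the decision to keep a word is made once, after the whole inner search has finished, rather than inside the inner loop.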
