Dictionary created from a text file: contains() always returns false

Question

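I'm currently working on a small university assignment and am having trouble with the contains() method of a dictionary class I've implemented: the method always returns false. The class looks like this: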

public class LocalDictionary {
    private ArrayList<String> wordsSet;

    public LocalDictionary() throws IOException {
        String wordListContents = new String(Files.readAllBytes(Paths.get("words.txt")));

        wordsSet = new ArrayList<>();
        String[] words = wordListContents.split("\n");
        for (int i = 0; i < words.length; i++) {
            wordsSet.add(words[i].toLowerCase());
        }
    }

    public boolean contains(String word) {
        return wordsSet.contains(word.toLowerCase());
    }
}

The "words.txt" file that the dictionary gets its words from is available at https://raw.githubusercontent.com/dwyl/english-words/master/words_alpha.txt, but here is an excerpt of what it looks like:

zinked
zinkenite
zinky
zinkiferous
zinkify
zinkified
zinkifies
zinkifying
zinnia
zinnias
zinnwaldite
zinober
zinsang
zinzar
zinziberaceae

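I've made sure that the words in "words.txt" end up in wordsSet, but I can't figure out why the contains method returns false for words that appear in the ArrayList.

Any help would be greatly appreciated.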

Answer 1

Trim each line in the for loop before adding it. Each word in the file appears to be followed by an extra whitespace character. Change

for (int i = 0; i < words.length; i++) {
    wordsSet.add(words[i].toLowerCase());
}

to

for (int i = 0; i < words.length; i++) {
    wordsSet.add(words[i].trim().toLowerCase());
}

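You can verify this with wordsSet.get(1).length(). According to your file, that entry is 'aa', yet it prints 3 rather than 2, because each word is followed by an extra whitespace character that has to be trimmed before the word is added to the list.

There is nothing wrong with your contains() method.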
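To make the length check above concrete, here is a minimal sketch. It assumes the extra character is a carriage return left over from Windows-style "\r\n" line endings, which fits the diagnosis in Answer 3 but is not confirmed by the asker. The class name, method names, and sample string are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class TrailingWhitespaceDemo {
    // Splits on "\n" only, as in the question; a "\r" before each "\n" stays attached to the word.
    static List<String> splitWithoutTrim(String contents) {
        List<String> words = new ArrayList<>();
        for (String line : contents.split("\n")) {
            words.add(line.toLowerCase());
        }
        return words;
    }

    // Same loop, but trims each line before adding it, as this answer suggests.
    static List<String> splitWithTrim(String contents) {
        List<String> words = new ArrayList<>();
        for (String line : contents.split("\n")) {
            words.add(line.trim().toLowerCase());
        }
        return words;
    }

    public static void main(String[] args) {
        // Hypothetical sample with "\r\n" endings (an assumption about the real file).
        String contents = "a\r\naa\r\nzinky\r\n";
        System.out.println(splitWithoutTrim(contents).get(1).length());   // prints 3 ("aa\r")
        System.out.println(splitWithoutTrim(contents).contains("zinky")); // prints false
        System.out.println(splitWithTrim(contents).contains("zinky"));    // prints true
    }
}
```

The untrimmed list stores "zinky\r", which is not equal to "zinky", so contains() reports false even though the word looks present when printed.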

Answer 2

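Try using a BufferedReader; I tried it and it works for me (I removed some unnecessary lines). The way you are doing it, reading all the bytes from the file and splitting them yourself, leaves extra bytes attached to the words.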

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;

public class LocalDictionary {
    private ArrayList<String> wordsSet = new ArrayList<>();

    public LocalDictionary() throws IOException {
        // Don't forget to put the absolute path here: right-click the file and copy its path.
        File file = new File("C:\\Users\\higuys\\IdeaProjects\\try\\src\\words.txt");

        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            String line;
            while ((line = br.readLine()) != null) {
                // Trim, lowercase, and add to the list.
                wordsSet.add(line.trim().toLowerCase());
            }
        }
    }

    public boolean contains(String word) {
        return wordsSet.contains(word.toLowerCase());
    }
}
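The reason BufferedReader helps is that readLine() returns each line without its terminator, whether that is "\n", "\r", or "\r\n". The sketch below shows this on an in-memory string so it runs without a file; the class name, method name, and sample data are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ReadLineDemo {
    // Reads words line by line; readLine() drops the line terminator for us.
    static List<String> readWords(String contents) {
        List<String> words = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(contents))) {
            String line;
            while ((line = br.readLine()) != null) {
                words.add(line.toLowerCase());
            }
        } catch (IOException e) {
            // A StringReader should not fail here; rethrow unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        // Mixed line endings on purpose: "\r\n" after the first word, "\n" after the second.
        System.out.println(readWords("Zinky\r\nzinnia\n")); // prints [zinky, zinnia]
    }
}
```

With readLine() the trim() call is only needed for stray spaces or tabs inside a line, not for the line ending itself.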

Answer 3

Your problem appears to be OS-dependent line separators being handled incorrectly here:

String[] words = wordListContents.split("\n");

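which leaves extra characters on the dictionary's strings. Not all operating systems use "\n" to separate lines, so your code should take that into account.

One option is to let Java tell you which line separator to use, and then split on it: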

String lineSeparator = System.getProperty("line.separator");
String[] words = wordListContents.split(lineSeparator);
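One caveat: line.separator describes the platform the JVM is running on, not the file, so a file saved with "\r\n" endings and read on Linux or macOS would still keep its "\r". A possible alternative, not part of the original answer, is to split on the regex linebreak matcher \R (available since Java 8), which matches "\r\n", "\n", and "\r" alike. The class and method names below are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class LineBreakSplitDemo {
    // Splits on any line break sequence; "\\R" treats "\r\n" as a single separator.
    static List<String> splitLines(String contents) {
        return Arrays.asList(contents.split("\\R"));
    }

    public static void main(String[] args) {
        // Hypothetical sample mixing all three common line endings.
        String contents = "zinky\r\nzinnia\nzinnias\rzinober";
        System.out.println(splitLines(contents)); // prints [zinky, zinnia, zinnias, zinober]
    }
}
```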

In my opinion, the simplest approach is to use Files to read all the lines, for example:

private List<String> wordsSet1;
private ArrayList<String> wordsSet2;

public TestDictionary(String path) throws IOException {
    // my code:
    wordsSet1 = Files.readAllLines(Paths.get(path));
    // ...
}

Files.readAllLines recognizes "\n", "\r", and "\r\n" as line terminators, so you don't have to pick the separator yourself.

Compare your code with mine:
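As a quick check of that behaviour, the sketch below writes a small sample with "\r\n" endings to a temporary file and reads it back with Files.readAllLines. The class name, method name, and sample contents are hypothetical, and the file is assumed to be UTF-8.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class ReadAllLinesDemo {
    // Writes the contents to a temporary file, reads them back line by line, then cleans up.
    static List<String> roundTrip(String contents) {
        try {
            Path tmp = Files.createTempFile("words", ".txt");
            try {
                Files.write(tmp, contents.getBytes(StandardCharsets.UTF_8));
                return Files.readAllLines(tmp); // decodes as UTF-8, strips line terminators
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> words = roundTrip("zinky\r\nzinnia\r\n");
        System.out.println(words);                   // prints [zinky, zinnia]
        System.out.println(words.contains("zinky")); // prints true
    }
}
```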

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class TestDictionary {
    // public static final String TXT_PATH = "src/pkg1/words.txt";
    // TODO: change this to your correct path
    public static final String TXT_PATH = "words.txt";
    private List<String> wordsSet1;
    private ArrayList<String> wordsSet2;

    public TestDictionary(String path) throws IOException {
        // my code:
        wordsSet1 = Files.readAllLines(Paths.get(path));

        // his code
        String wordListContents = new String(Files.readAllBytes(Paths.get(path)));

        wordsSet2 = new ArrayList<>();
        String[] words = wordListContents.split("\n");
        for (int i = 0; i < words.length; i++) {
            wordsSet2.add(words[i].toLowerCase());
        }

    }

    public boolean myContains(String word) {
        return wordsSet1.contains(word.toLowerCase());
    }

    public boolean hisContains(String word) {
        return wordsSet2.contains(word.toLowerCase());
    }

    public static void main(String[] args) {
        try {
            TestDictionary testDictionary = new TestDictionary(TXT_PATH);

            String testWord = "zinky";
            System.out.println("My List contains \"zinky\":  " + testDictionary.myContains(testWord));
            System.out.println("His List contains \"zinky\": " + testDictionary.hisContains(testWord));

        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

If you aren't sure that the original text file is all lowercase and need to lowercase the words yourself, you can use Streams to do that:

wordsSet1 = Files.readAllLines(Paths.get(path))
        .stream().map(s -> s.toLowerCase())
        .collect(Collectors.toList());
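The same stream pipeline can also trim each line and collect into a Set. This goes slightly beyond what the answers propose and is only a sketch: the field in the question is named wordsSet but is an ArrayList, whose contains() scans the whole list, while a hash-based Set looks words up directly. The class and method names are hypothetical, and Locale.ROOT is my choice to keep lowercasing independent of the machine's default locale.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

class NormalizedWordSet {
    // Trims, drops blank lines, lowercases, and collects the words into a Set.
    static Set<String> normalize(List<String> lines) {
        return lines.stream()
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .map(s -> s.toLowerCase(Locale.ROOT))
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // Hypothetical lines, including a stray "\r" and an empty line.
        Set<String> words = normalize(Arrays.asList("Zinky\r", "ZINNIA", ""));
        System.out.println(words.contains("zinky")); // prints true
        System.out.println(words.size());            // prints 2
    }
}
```

In the real class, the input list would come from Files.readAllLines(Paths.get(path)) as in the answer above.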

3 answers

Answer 1: 0, accepted, 2018-10-06 13:46:45
Answer 2: 0, 2018-10-06 14:10:00
Answer 3: 0, 2018-10-06 14:30:41
