I can't seem to figure out how to print all the words, including duplicates

Question

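I'm trying to print all the words in a text file in ascending order. When I run it, the words come out in ascending order, but each word is printed only once. I want it to print every occurrence of each word (duplicates are required). I'm not sure what I'm doing wrong. Also, I'd like it to print only the words in the text file, not the punctuation. I know I need to use "split", I'm just not sure how to use it properly. I've used it once before but can't remember how to apply it here.

Here is the code I have so far: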

public class DisplayingWords {

public static void main(String[] args) throws 
        FileNotFoundException, IOException 
{
    Scanner ci = new Scanner(System.in);
    System.out.print("Please enter a text file to open: ");
    String filename = ci.next();
    System.out.println("");

    File file = new File(filename);
    BufferedReader br = new BufferedReader(new FileReader(file));

    StringBuilder sb = new StringBuilder();
    String str;
    while((str = br.readLine())!= null)

    {
/*
 * This is where i seem to be having my problems.
 * I have only ever used a split once before and can not 
 * remember how to properly use it. 
 * i am trying to get the print out to avoid printing out 
 * all the punctuation marks and have only the words
 */

      //  String[] str = str.split("[ \n\t\r.,;:!?(){}]");
        str.split("[ \n\t\r.,;:!?(){}]");
        sb.append(str);
        sb.append(" ");
        System.out.println(str);
    }

    ArrayList<String> text = new ArrayList<>();
    StringTokenizer st = new StringTokenizer(sb.toString().toLowerCase());
            while(st.hasMoreTokens()) 
            {
                String s = st.nextToken();
                text.add(s);
            }

            System.out.println("\n" + "Words Printed out in Ascending "
                                + "(alphabetical) order: " + "\n");

            HashSet<String> set = new HashSet<>(text);
            List<String> arrayList = new ArrayList<>(set);
            Collections.sort(arrayList);
            for (Object ob : arrayList)
                System.out.println("\t" + ob.toString());
    }
}

Answer 1

Your duplicates are probably being removed here:

HashSet<String> set = new HashSet<>(text);

A Set never contains duplicates, so I would just sort the text ArrayList directly:

Collections.sort(text);
for (Object ob : text)
    System.out.println("\t" + ob.toString());
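To make the difference concrete, here is a minimal sketch comparing the two approaches. The class name DuplicateDemo, the method names, and the sample words are made up for illustration and are not part of the original program.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;

public class DuplicateDemo {

    // Copies the words through a HashSet before sorting, so duplicates are lost.
    public static List<String> sortedViaSet(List<String> words) {
        List<String> unique = new ArrayList<>(new HashSet<>(words));
        Collections.sort(unique);
        return unique;
    }

    // Sorts a plain copy of the list, so every occurrence is kept.
    public static List<String> sortedKeepingDuplicates(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<String> text = Arrays.asList("the", "cat", "the", "hat");
        System.out.println(sortedViaSet(text));            // [cat, hat, the]
        System.out.println(sortedKeepingDuplicates(text)); // [cat, hat, the, the]
    }
}
```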

Answer 2

The problem is here:

HashSet<String> set = new HashSet<>(text);

A Set does not contain duplicates.

You should use the following code instead:

    //HashSet<String> set = new HashSet<>(text);
    List<String> arrayList = new ArrayList<>(text);
    Collections.sort(arrayList);

Also, for the split method, I suggest you use:

s.split("[\\s\\.,;:\\?!]+");

For example, consider the code given below:

String s = "Abcdef;Ad; country hahahahah?           ad! \n alsj;d;lajfa try.... wait, which wish work";
String sp[] = s.split("[\\s\\.,;:\\?!]+");
for (String sr : sp )
{
    System.out.println(sr);
}

Its output is as follows:

Abcdef
Ad
country
hahahahah
ad
alsj
d
lajfa
try
wait
which
wish
work
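Putting both suggestions together, a rough sketch might look like the following. Note that split returns a new array and leaves the original string unchanged, so its result has to be stored and used. The class name SortedWords, the method sortedWords, and the sample sentence are hypothetical. The character class also includes the brackets from the question's own pattern, and the text is passed in as a string rather than read from a file to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SortedWords {

    // Splits the text on whitespace and the listed punctuation, lowercases it,
    // and returns every word in ascending order, duplicates included.
    public static List<String> sortedWords(String text) {
        List<String> words = new ArrayList<>();
        for (String token : text.toLowerCase().split("[\\s.,;:?!(){}]+")) {
            if (!token.isEmpty()) { // a leading delimiter yields one empty token
                words.add(token);
            }
        }
        Collections.sort(words);
        return words;
    }

    public static void main(String[] args) {
        String sample = "The cat saw the hat. The hat, the cat!";
        for (String word : sortedWords(sample)) {
            System.out.println("\t" + word);
        }
    }
}
```

In the original program, the same method could be called on sb.toString() after the read loop, in place of the StringTokenizer and HashSet steps.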
