How to split input file into words in Java?

I have to read text from an input file in Java, and first split it into lines and then into words. The method used here, to my understanding, stores the words in a list. Can you explain a bit more how this method works, or how I could do it differently? Thank you!

import java.util.ArrayList;
import java.util.Arrays;

public class StoreLinesFormat {
    private ArrayList<ArrayList<String>> storeDataList;

    public StoreLinesFormat() {

    }

    public ArrayList<ArrayList<String>> readFormat(ArrayList<String> inputDataList) {
        ArrayList<String> data = inputDataList;
        if (data != null) {
            storeDataList = new ArrayList<ArrayList<String>>();

            for (String string : data) {
                ArrayList<String> inner = new ArrayList<String>(Arrays.asList(string.split(" ")));
                storeDataList.add(inner);
            }

            return storeDataList;

        } else {
            System.out.println("Array error detected. NULL array value.");
            return null;
        } 
    }

}

Well, let me break it down for you.

public ArrayList<ArrayList<String>> readFormat(ArrayList<String> inputDataList)

This method takes an ArrayList as an argument. Each element of that ArrayList is expected to hold one line of the file.

For example, the 1st line of the file is at the 1st index of the ArrayList, the 2nd line of the file is at the 2nd index, and so on.

Moving on to the loop:
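The question does not show how inputDataList is filled, so the following is only a sketch of one way to get a list with one line per element. The class name LineReaderSketch, the helper readLines, and the path "input.txt" are all made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.stream.Collectors;

public class LineReaderSketch {

    // Collects every line from the given source into an ArrayList,
    // one line per element (line terminators are not included).
    static ArrayList<String> readLines(Reader source) {
        BufferedReader reader = new BufferedReader(source);
        return reader.lines().collect(Collectors.toCollection(ArrayList::new));
    }

    public static void main(String[] args) throws IOException {
        // "input.txt" is a placeholder path, not something from the question.
        String path = args.length > 0 ? args[0] : "input.txt";
        try (Reader source = Files.newBufferedReader(Paths.get(path))) {
            ArrayList<String> inputDataList = readLines(source);
            for (int i = 0; i < inputDataList.size(); i++) {
                System.out.println(i + ": " + inputDataList.get(i));
            }
        }
    }
}
```

A list built this way could then be passed to readFormat as inputDataList.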

for (String string : data) {
    ArrayList<String> inner = new ArrayList<String>(Arrays.asList(string.split(" ")));
    storeDataList.add(inner);
}

This for-each loop iterates over every element (each line), splits it at each space character, and creates a new list to store the resulting words.

The result is an ArrayList whose every element is itself an ArrayList, one per line, holding that line's words at its indices.

A neater alternative

In case it's too complex for you to handle, take a look at this solution:
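To make the nested structure concrete, here is a small sketch that applies the same splitting logic to two hard-coded lines. The class name NestedListDemo, the helper splitLines, and the sample text are assumptions for illustration; the loop body mirrors the one in readFormat.

```java
import java.util.ArrayList;
import java.util.Arrays;

public class NestedListDemo {

    // Same splitting logic as the loop in readFormat, written as a static helper.
    static ArrayList<ArrayList<String>> splitLines(ArrayList<String> lines) {
        ArrayList<ArrayList<String>> result = new ArrayList<>();
        for (String line : lines) {
            // One inner list per line, holding the pieces between single spaces.
            result.add(new ArrayList<>(Arrays.asList(line.split(" "))));
        }
        return result;
    }

    public static void main(String[] args) {
        ArrayList<String> lines =
                new ArrayList<>(Arrays.asList("the quick fox", "jumps over"));
        ArrayList<ArrayList<String>> words = splitLines(lines);
        System.out.println(words);               // [[the, quick, fox], [jumps, over]]
        System.out.println(words.get(1).get(0)); // jumps
    }
}
```

The outer index selects the line and the inner index selects the word within that line.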

https://www.javacodex.com/Files/Read-File-Word-By-Word
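The linked page is not reproduced here, so the following is only a sketch of one common way to read a file word by word, using java.util.Scanner, and may differ from what the link shows. The class name WordScannerSketch, the helper readWords, and the path "input.txt" are made up for illustration.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;

public class WordScannerSketch {

    // Reads whitespace-separated tokens until the scanner runs out of input.
    // Scanner's default delimiter is any run of whitespace, including newlines,
    // so line boundaries are not preserved in the result.
    static ArrayList<String> readWords(Scanner scanner) {
        ArrayList<String> words = new ArrayList<>();
        while (scanner.hasNext()) {
            words.add(scanner.next());
        }
        return words;
    }

    public static void main(String[] args) throws FileNotFoundException {
        // "input.txt" is a placeholder path, not something from the question.
        String path = args.length > 0 ? args[0] : "input.txt";
        try (Scanner scanner = new Scanner(new File(path))) {
            for (String word : readWords(scanner)) {
                System.out.println(word);
            }
        }
    }
}
```

Unlike readFormat, this produces one flat list of words. If you need to know which line each word came from, the nested-list approach is the better fit.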

inputDataList seems to be an ArrayList containing the file, line by line.

Then, on every iteration of that for loop, inner gets the words of the current line (where "word" here means a piece of the line separated at the space character), because split is called on the single line being processed inside that loop.

At that point the list of separated words is added as a single element to storeDataList, and the cycle repeats for each element of the list (i.e. each line of the file).
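Because "word" here depends entirely on the single space passed to split, lines with double spaces or tabs can produce empty strings or glued-together words. As a sketch of one possible variation (not something the original code does), you could trim the line and split on runs of whitespace instead. The class name WhitespaceSplitSketch and the helper splitWords are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;

public class WhitespaceSplitSketch {

    // Splits on runs of whitespace; a blank line yields an empty list.
    static ArrayList<String> splitWords(String line) {
        // Trim first, otherwise leading whitespace produces an empty first element.
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    public static void main(String[] args) {
        String line = "two  spaces here";
        // Splitting on a single space leaves an empty string between the two spaces.
        System.out.println(Arrays.asList(line.split(" "))); // [two, , spaces, here]
        System.out.println(splitWords(line));               // [two, spaces, here]
    }
}
```

Swapping string.split(" ") for a helper like this inside the loop would keep the nested-list structure unchanged.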
