Grab Words from .txt file - Java

I have been tasked with grabbing 100 words, at random, from a dictionary.txt file. I have been able to read the file using a Scanner, populate an array with one element per line (so each element holds a word plus its definition), and then format it to remove the brackets. However, now I need to figure out how to grab just the first word from each array element, possibly by using a regex or by relying on the fact that the first word in each element ends at a space.

My question is: how would one go about grabbing just the first word out of every array element or, as mentioned below, the first word of each line?

You probably just want to use yourString.split(" ")[0].
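As a minimal sketch of the split approach: the class name FirstWordSplit and the sample line are made up for illustration, and the sketch swaps in trim() plus split("\\s+", 2) rather than the plain split(" "), because a line that starts with a space would otherwise yield an empty first element.

```java
public class FirstWordSplit {
    // Returns the first whitespace-delimited token of a line, or "" for a blank line.
    public static String firstWord(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        // A limit of 2 stops splitting after the first separator,
        // so the rest of the line is not broken into tokens needlessly.
        return trimmed.split("\\s+", 2)[0];
    }

    public static void main(String[] args) {
        // Hypothetical dictionary line: a word followed by its definition.
        System.out.println(firstWord("apple a round fruit")); // prints "apple"
    }
}
```

If every line is known to start with the word and use a single space as the separator, the shorter split(" ")[0] behaves the same way.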

However, I believe constructing an array of all the lines is wasteful. You could instead build an array of just the first word of each line while reading the file with a Scanner, or you could even make a first pass to count the lines and then construct only the final desired result.
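One way this could look, as a rough sketch rather than the definitive solution: the Scanner keeps only the first word of each line, and the 100 random words are then taken from a shuffled copy. The names RandomFirstWords, firstWords and pickRandom are hypothetical, and the example reads from an in-memory sample so that it runs without the real file; for dictionary.txt you would presumably pass new Scanner(new File("dictionary.txt")) instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.Scanner;

public class RandomFirstWords {
    // Collects the first token of every non-blank line without storing the full lines.
    public static List<String> firstWords(Scanner in) {
        List<String> words = new ArrayList<>();
        while (in.hasNextLine()) {
            String line = in.nextLine().trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            int space = line.indexOf(' ');
            // A line without a space is a single word, so keep it whole.
            words.add(space == -1 ? line : line.substring(0, space));
        }
        return words;
    }

    // Returns up to n entries chosen at random without replacement.
    public static List<String> pickRandom(List<String> words, int n, Random rnd) {
        List<String> copy = new ArrayList<>(words);
        Collections.shuffle(copy, rnd);
        return new ArrayList<>(copy.subList(0, Math.min(n, copy.size())));
    }

    public static void main(String[] args) {
        // In-memory stand-in for dictionary.txt (assumed format: word, space, definition).
        String sample = "apple a round fruit\nbanana a long yellow fruit\n\ncherry a small red fruit\n";
        List<String> words = firstWords(new Scanner(sample));
        System.out.println(words);
        // With the real file this would be pickRandom(words, 100, new Random()).
        System.out.println(pickRandom(words, 2, new Random()));
    }
}
```

Shuffling a copy and taking the first n entries avoids picking the same line twice, which repeated calls to Random.nextInt() would not guarantee.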

Oh, and one last edit for your regex culture: the appropriate regex would have been ^\S+ (written "^\\S+" in a Java string literal), which grabs all non-whitespace characters at the start of a string. It's most probably less efficient than using String.split().
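For completeness, a small sketch of how that regex might be applied with java.util.regex; the class name FirstWordRegex and the sample line are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstWordRegex {
    // ^\S+ : one or more non-whitespace characters anchored at the start of the input.
    private static final Pattern FIRST_WORD = Pattern.compile("^\\S+");

    // Returns the leading run of non-whitespace characters, or "" if the line
    // is empty or begins with whitespace.
    public static String firstWord(String line) {
        Matcher m = FIRST_WORD.matcher(line);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(firstWord("apple a round fruit")); // prints "apple"
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex for every line.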

The easiest way to grab the first word from each line would be to use indexOf() and substring():

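String[] lines = new String[100];
String[] words = new String[100];

// your code for getting the lines of text from the file goes here

for (int i = 0; i < lines.length; i++) {
    // index of the first space, or -1 if the line contains no space
    int x = lines[i].indexOf(' ');
    // a line without a space is a single word, so keep it whole
    // (substring(0, -1) would throw StringIndexOutOfBoundsException)
    words[i] = (x == -1) ? lines[i] : lines[i].substring(0, x);
}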
