Number of words in a string that are not in an array of strings
I want to create a method that returns the number of words in a string that do not occur in an array of strings. I want to implement this logic using only what is in the java.lang package.
public int count(String a, String[] b) {
}
For example:
count(" hey are you there ", new String[]{ "are", "i", "am"})
would return 3, since "are" is the only word in the string that also appears in the array.
First off, I think I have to use the String.split method to convert the string to an array of strings. Any ideas?
You could simply do something like:
public int count(String a, String[] b) {
    int count = b.length;
    // Note: contains() is a substring test, so this counts entries of b that do not occur anywhere in a
    for (String s : b) if (a.contains(s)) count--;
    return count;
}
EDIT: I might have been confused; I thought you wanted the number of strings in b that are not in a (in your example it would still be 3). In that case, judging from your example, split seems inconvenient unless you use a regex, so you could collect the words using Scanner:
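import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

public int count(String a, String[] b) {
    List<String> excluded = Arrays.asList(b); // list view of b, so contains() can be used
    ArrayList<String> words = new ArrayList<String>();
    Scanner scan = new Scanner(a);
    while (scan.hasNext()) words.add(scan.next()); // whitespace-delimited tokens
    int count = words.size();
    for (String s : words) if (excluded.contains(s)) count--; // linear search in b
    return count;
}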
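For comparison, the regex route mentioned above can be sketched without any imports, which matches the java.lang-only constraint in the question. This is a minimal sketch, assuming a "word" is a maximal run of non-whitespace characters; the class name SplitCount is hypothetical and not from the thread.

```java
// Hypothetical helper class; the name is illustrative and not from the thread.
final class SplitCount {
    // Counts whitespace-separated words in a that do not equal any entry of b.
    static int count(String a, String[] b) {
        String trimmed = a.trim();
        if (trimmed.isEmpty()) {
            return 0; // "".split("\\s+") would otherwise yield one empty token
        }
        int count = 0;
        for (String word : trimmed.split("\\s+")) {
            boolean found = false;
            for (String candidate : b) { // linear search over b
                if (word.equals(candidate)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                count++;
            }
        }
        return count;
    }
}
```

Trimming first avoids the empty leading token that split would otherwise produce for input that starts with whitespace.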
Your logic should go somewhat like this:
1. Split a, right. Now you have a list of words. In real life, you should probably also clarify the requirement: what exactly is a "word"? A reasonable assumption is that it is a sequence of non-whitespace characters, but it could be something different (for example, a sequence of letters).
2. Iterate over the words of a and check whether each one is in b. If it isn't, increment your counter. But every check is a linear search in b, leading to a total complexity of O(nm) for n words and m entries in b, so...
3. Before iterating, convert b into a HashSet. This is a linear operation, but then your main loop also becomes a linear operation, so the total expected complexity will be O(m + n).
4. If you have to do this repeatedly for different strings but the same word list, consider creating a WordCounter class so you only have to create the HashSet once, in the constructor.
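The steps above can be sketched roughly as follows. This is only an illustration, not the answerer's code: the member names are assumptions, a "word" is assumed to be a run of non-whitespace characters, and it relies on java.util, so it does not meet the java.lang-only constraint from the question.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Sketch of the WordCounter idea; member names are illustrative.
final class WordCounter {
    private final Set<String> excluded;

    WordCounter(String[] b) {
        // Built once per word list: O(m) for m entries in b.
        this.excluded = new HashSet<String>(Arrays.asList(b));
    }

    // Counts whitespace-separated words of a that are not in the set.
    int count(String a) {
        int count = 0;
        for (String word : a.trim().split("\\s+")) {
            // The isEmpty check skips the single empty token produced by blank input.
            if (!word.isEmpty() && !excluded.contains(word)) {
                count++;
            }
        }
        return count;
    }
}
```

A caller would construct one WordCounter per word list, for example new WordCounter(new String[]{"are", "i", "am"}), and then call count on each input string.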
Follow these steps to complete the task:
1. Use StringTokenizer to tokenize the String a.
2. Convert b to a Collection, so that you can check whether it contains a given token.
3. Get the next token from the StringTokenizer and check whether the List contains it.
Try the code below; it should work.
EDIT: Using the java.util package.
public int count(String a, String[] b) {
    java.util.StringTokenizer tokenizer = new java.util.StringTokenizer(a);
    java.util.List<String> bList = java.util.Arrays.asList(b);
    int tokens = tokenizer.countTokens();
    int counter = tokens;
    for (int i = 0; i < tokens; i++) {
        String token = tokenizer.nextToken().trim();
        if (bList.contains(token)) {
            counter--;
        }
    }
    return counter;
}
With this, you get the count in a single explicit for loop (although bList.contains is itself a linear scan of b).
EDIT: Using the java.lang package only.
public int count(String a, String[] b) {
    String[] words = a.split(" ");
    int tokens = words.length;
    int wordCount = 0; // non-empty words found in a
    int counter = 0;   // words of a that also appear in b
    for (int i = 0; i < tokens; i++) {
        String token = words[i].trim();
        if (token.length() <= 0) {
            continue; // skip empty tokens caused by leading or repeated spaces
        }
        wordCount++;
        for (String bItem : b) {
            if (bItem.equals(token)) {
                counter++;
                break;
            }
        }
    }
    return wordCount - counter;
}