
Number of words in a string that are not in an array of strings

I want to create a method that returns the number of words in a string that do not appear in a given array of strings. I want to implement this logic using only what is in the java.lang package.

public int count(String a, String[] b) {

}

For example:

count("  hey   are  you there    ", new String[]{ "are", "i", "am"})

would return 3, because of the four words in the string only "are" appears in the array.

First off, I think I have to use String.split to convert the string into an array of strings. Any ideas?

You could simply do something like:

public int count(String a, String[] b) {
    int count = b.length;
    for(String s : b) if(a.contains(s)) count--;
    return count;
}

EDIT: I may have misread the question; I thought you wanted the number of strings in b that do not occur in a (for your example that version returns 2, because only "are" occurs in a). If you want the number of words in a that are not in b, split seems inconvenient for input like yours unless you use a whitespace regex, because repeated spaces produce empty strings. Alternatively, you could collect the words with a Scanner, which skips whitespace by default. Note that Scanner and ArrayList are in java.util, not java.lang:

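import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

public int count(String a, String[] b) {
    // Scanner splits on whitespace by default, so no empty tokens are produced.
    List<String> words = new ArrayList<String>();
    Scanner scan = new Scanner(a);
    while (scan.hasNext()) words.add(scan.next());

    // View b as a List so that each word can be looked up with contains.
    List<String> excluded = Arrays.asList(b);
    int count = words.size();
    for (String s : words) if (excluded.contains(s)) count--;
    return count;
}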
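
One caveat about the first snippet in this answer: String.contains tests for substrings, not whole words, so a short entry in b can match inside a longer word of a. A minimal sketch of the difference follows; the class name ContainsVsWord and the helper hasWord are made up for illustration, and a word is assumed to be a run of non-whitespace characters.

```java
public class ContainsVsWord {
    // Hypothetical helper: true only if 'word' equals one whitespace-separated token of 'text'.
    static boolean hasWord(String text, String word) {
        for (String token : text.trim().split("\\s+")) {
            if (token.equals(word)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String a = "this is fine";
        System.out.println(a.contains("i"));   // true: "i" occurs inside "this"
        System.out.println(hasWord(a, "i"));   // false: no token equals "i"
        System.out.println(hasWord(a, "is"));  // true: "is" is a token
    }
}
```

This is why the token-based versions compare whole words with equals rather than calling contains on the original string.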

Your logic should go something like this:

  1. Split a, right. Now you have a list of words. In real life, you should probably also try to clarify the requirement: what exactly is a "word"? A reasonable assumption is that it is a sequence of non-whitespace characters, but it could be something different (for example, a sequence of letters).

  2. Iterate over the words of a and check whether each one is in b. If it isn't, increment your counter. But every check is a linear search in b, which makes the total complexity O(nm) for n words and m entries in b, so...

  3. Before iterating, convert b into a HashSet. This is a linear operation, but each lookup is then expected constant time, so the main loop also becomes linear and the total complexity is O(m + n).

  4. If you have to do this repeatedly for different strings but the same word list, consider creating a WordCounter class so that you only have to create the HashSet once, in the constructor.
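
The steps above can be sketched roughly as follows. The class name WordCounter comes from step 4; the method name countNotIn is an assumption made for illustration, and a word is taken to be a run of non-whitespace characters. Note that HashSet lives in java.util, so this sketch does not satisfy the question's java.lang-only constraint.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class WordCounter {
    private final Set<String> excluded;

    // Build the set once: O(m) for m entries in b.
    public WordCounter(String[] b) {
        this.excluded = new HashSet<>(Arrays.asList(b));
    }

    // Count the words of 'a' that are not in the set: expected O(n) for n words.
    // countNotIn is a hypothetical name, not taken from the original answer.
    public int countNotIn(String a) {
        int count = 0;
        for (String word : a.trim().split("\\s+")) {
            // An empty or all-whitespace input yields a single empty token; skip it.
            if (!word.isEmpty() && !excluded.contains(word)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        WordCounter counter = new WordCounter(new String[] {"are", "i", "am"});
        System.out.println(counter.countNotIn("  hey   are  you there    ")); // 3
        System.out.println(counter.countNotIn("i am here"));                  // 1
    }
}
```

Building the set in the constructor means repeated calls with the same word list pay the O(m) setup cost only once.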

Follow these steps to complete the task.

  • Use StringTokenizer to tokenize the String a.
  • Convert the String array b to a List, so that you can check whether it contains a given token.
  • Loop over the tokens from the StringTokenizer and check whether the List contains each one.


Try the code below.

EDIT: Using the java.util package.

public int count(String a, String[] b) {
    // StringTokenizer splits on whitespace by default and never returns empty tokens.
    java.util.StringTokenizer tokenizer = new java.util.StringTokenizer(a);
    java.util.List<String> bList = java.util.Arrays.asList(b);
    // Start from the total number of words and subtract those found in b.
    int counter = tokenizer.countTokens();
    while (tokenizer.hasMoreTokens()) {
        if (bList.contains(tokenizer.nextToken())) {
            counter--;
        }
    }
    return counter;
}

This needs only one explicit loop, although bList.contains is itself a linear scan of b on every iteration.

EDIT: Using the java.lang package only.

public int count(String a, String[] b) {
    // Split on runs of whitespace; leading whitespace still yields one empty token.
    String[] words = a.split("\\s+");
    int counter = 0;
    for (String token : words) {
        if (token.isEmpty()) {
            continue; // skip the empty token produced by leading whitespace
        }
        // Linear search in b for the current word.
        boolean found = false;
        for (String bItem : b) {
            if (bItem.equals(token)) {
                found = true;
                break;
            }
        }
        if (!found) {
            counter++;
        }
    }
    return counter;
}

