
How do I search for an exact word in a String array that contains sentences? (Java)

I have a String array containing n elements, where n is chosen by the user.

Suppose there are 3 String elements:

Hey,
Hello there,
Hell no

And I want to search for the word Hell.

The program should output only the third sentence, not the second, even though Hello contains Hell.

Another example, where the elements are:

10
50
110

If I search for 10, the output should be the first element and not the third (even though 110 contains 10).

I have written a linear search over a String array, but I don't see how to apply it to the words inside sentences.

Help would be appreciated. Thank you.

The equals method is a better fit for your requirement:

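String[] strArray = { "Hey", "Hello there", "Hell no" };
String inputStr = "Hell";

for (int i = 0; i < strArray.length; i++) {
    // Break the sentence into words on single spaces
    String[] contents = strArray[i].split(" ");
    for (int j = 0; j < contents.length; j++) {
        // equals matches the whole word only, so "Hello" does not match "Hell"
        if (inputStr.equals(contents[j])) {
            System.out.println(strArray[i]);
            break; // print each sentence once, even if the word repeats
        }
    }
}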

Here, we iterate over the initial array, split each sentence into words, and then loop over the resulting array to check whether any word matches.

Hope this helps:
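If the sentences can contain punctuation or repeated spaces, the same idea might be packaged as a method that splits on runs of non-word characters instead of a single space. This is only a sketch: the class name ExactWordSearch and the method findSentences are made up for illustration, and \W is ASCII-only by default, so non-English text may need a different delimiter.

```java
import java.util.ArrayList;
import java.util.List;

class ExactWordSearch {

    // Hypothetical helper: returns every sentence that contains `word` as a whole token.
    static List<String> findSentences(String[] sentences, String word) {
        List<String> hits = new ArrayList<>();
        for (String sentence : sentences) {
            // Split on runs of non-word characters, so "Hell," and "Hell  no" both yield "Hell"
            for (String token : sentence.split("\\W+")) {
                if (token.equals(word)) {
                    hits.add(sentence);
                    break; // report each sentence once
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] words = { "Hey", "Hello there", "Hell no" };
        System.out.println(findSentences(words, "Hell"));   // [Hell no]

        String[] numbers = { "10", "50", "110" };
        System.out.println(findSentences(numbers, "10"));   // [10]
    }
}
```

Returning a list rather than printing inside the loop makes it easier to reuse the search or to report "no match" when the list is empty.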

package demo;

import java.util.Arrays;
import java.util.List;
import java.util.logging.Logger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;


enum SearchType {
    EXACTWORD,
    EXACT_NON_WORD,
    IN_THE_BEGINNING,
    IN_THE_END,
    INSIDE
}
public class Main {
    private final static Logger LOGGER =  
            Logger.getLogger(Logger.GLOBAL_LOGGER_NAME); 
    public static void main(String[] args) {

        // Demo with Strings
        String myText = "a,abc,abcd, abd def,abc, fgh,erf abc,Hey, Hello there, Hell no";
        String separator = ",";
        String toFind = "abc";


        System.out.println("\n------------------------------------------");
        System.out.println("SEARCHING TO: " + toFind);
        System.out.println("WITH : " + myText);
        System.out.println("------------------------------------------");
        for (SearchType searchType : SearchType.values()) { 
            //LOGGER.log(Level.INFO, "Search Type: " + searchType);
            String result = filterInputByRegex(myText, separator, toFind, searchType);
            System.out.println("\tResults for " 
                    + searchType + " >\t" 
                    + (!result.isEmpty() ? result : "There is no match"));
            System.out.println("matches indexes: " + Arrays.toString(searchIndexes(myText, separator, toFind, searchType)));
            //LOGGER.log(Level.INFO, "matches indexes: " + Arrays.toString(searchIndexes(myText, separator, toFind, searchType)));
        }

        // Demo with integers
        myText = "  10  01   100   121 110 010 120 11";
        separator = " ";
        toFind = "10";

        System.out.println("\n------------------------------------------");
        System.out.println("SEARCHING TO: " + toFind);
        System.out.println("WITH : " + myText);
        System.out.println("------------------------------------------");
        for (SearchType searchType : SearchType.values()) { 
            //LOGGER.log(Level.INFO, "Search Type: " + searchType);
            String result = filterInputByRegex(myText, separator, toFind, searchType);
            System.out.println("\tResults for " 
                    + searchType + " >\t" 
                    + (!result.isEmpty() ? result : "There is no match"));
            System.out.println("matches indexes: " + Arrays.toString(searchIndexes(myText, separator, toFind, searchType)));
            //LOGGER.log(Level.INFO, "matches indexes: " + Arrays.toString(searchIndexes(myText, separator, toFind, searchType)));
        }
    }

    /**
     * test regex
     * @param regex
     * @param text
     * @return
     */
    public static boolean matches(String regex, String text) {
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(text);
        return matcher.find();
    }
    /**
     *  Prepare Regex by enum searchType (exact word, non exact word, in the beginning, etc.)
     * @param search
     * @param searchType
     * @return
     */
    public static String prepareRegex(String search, SearchType searchType) {
        // Quote the term so regex metacharacters in it (".", "+", ...) are matched literally
        String term = Pattern.quote(search);
        String text = "";
        switch(searchType) {
            case EXACTWORD:
                text = ".*\\b" + term + "\\b.*";
                break;
            case EXACT_NON_WORD:
                text = ".*\\B" + term + "\\B.*";
                break;
            case IN_THE_BEGINNING:
                text = "\\A" + term + ".*";
                break;
            case IN_THE_END:
                text = ".*" + term + "\\z";
                break;
            case INSIDE:
                text = ".*" + term + ".*";
                break;
        }
        return text;
    }
    /**
     * Split String to List
     * @param input
     * @param separator "," for String or " " for integer list;
     * @return
     */
    public static List<String> splitToListString(String input, String separator){
        return Stream.of(input.split(separator))
                .filter(str -> !str.isEmpty())
                .collect(Collectors.toList());
    }

    /**
     * Join List to String (only for demo)
     * @param input
     * @param separator
     * @return
     */
    public static String joinStringListWithSeparator(List<String> input, String separator){
        return input.stream().collect(Collectors.joining(separator));

    }

    /**
     * Get Indexes of matching elements
     * @param input
     * @param separator
     * @param search
     * @param searchType
     * @return
     */
    public static int[] searchIndexes(String input, String separator, String search, SearchType searchType) {

        final String toFind = prepareRegex(search, searchType);

        List<String> sentences = splitToListString(input, separator);

        int[] indexesOfResults = IntStream
            .range(0,  sentences.size())
            .filter(index -> matches(toFind, sentences.get(index)))
            .toArray();

        return indexesOfResults;

    }

    /**
     * Filter List (generated from String) by Regex
     * @param input
     * @param separator
     * @param search
     * @param searchType
     * @return
     */
    public static String filterInputByRegex(String input, String separator, String search, SearchType searchType) {

        final String toFind = prepareRegex(search, searchType);

        List<String> sentences = splitToListString(input, separator);

        List<String> results = sentences
            .stream()
            .parallel()
            .filter(elem -> matches(toFind, elem))
            .collect(Collectors.toList());

        return joinStringListWithSeparator(results, separator);

    }
}

For "abc" and "10", this demo prints:


------------------------------------------
SEARCHING TO: abc
WITH : a,abc,abcd, abd def,abc, fgh,erf abc,Hey, Hello there, Hell no
------------------------------------------
    Results for EXACTWORD > abc,abc,erf abc
matches indexes: [1, 4, 6]
    Results for EXACT_NON_WORD >    There is no match
matches indexes: []
    Results for IN_THE_BEGINNING >  abc,abcd,abc
matches indexes: [1, 2, 4]
    Results for IN_THE_END >    abc,abc,erf abc
matches indexes: [1, 4, 6]
    Results for INSIDE >    abc,abcd,abc,erf abc
matches indexes: [1, 2, 4, 6]

------------------------------------------
SEARCHING TO: 10
WITH :   10  01   100   121 110 010 120 11
------------------------------------------
    Results for EXACTWORD > 10
matches indexes: [0]
    Results for EXACT_NON_WORD >    There is no match
matches indexes: []
    Results for IN_THE_BEGINNING >  10 100
matches indexes: [0, 2]
    Results for IN_THE_END >    10 110 010
matches indexes: [0, 4, 5]
    Results for INSIDE >    10 100 110 010
matches indexes: [0, 2, 4, 5]
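The empty EXACT_NON_WORD results above follow from how \B works: it only matches where there is no word boundary, so \Bterm\B needs a word character on both sides of the term. A small sketch of that behaviour, using a made-up helper strictlyInside that is not part of the demo:

```java
import java.util.regex.Pattern;

class NonBoundaryDemo {

    // Hypothetical helper: true only if `term` occurs with a word character on each side.
    static boolean strictlyInside(String term, String text) {
        return Pattern.compile("\\B" + Pattern.quote(term) + "\\B")
                .matcher(text)
                .find();
    }

    public static void main(String[] args) {
        System.out.println(strictlyInside("10", "110"));    // false: "10" touches the end
        System.out.println(strictlyInside("10", "1101"));   // true: digits on both sides
        System.out.println(strictlyInside("abc", "abcd"));  // false: "abc" touches the start
        System.out.println(strictlyInside("abc", "xabcx")); // true
    }
}
```

None of the elements in either demo input has the search term strictly in the middle, which is why that search type reports no match.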

Try this. The key is to use a word boundary (\b, written "\\b" in a Java string literal) in the regex:

System.out.println(Arrays.stream("Hey, Hello there,Hell no".split(","))
        .filter(s -> s.matches(".*\\bHell\\b.*"))
        .collect(Collectors.joining(","))
);
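When the search word comes from the user rather than being hard-coded, the one-liner could be generalised roughly as follows. This is a sketch under assumptions: the class and method names are invented here, Pattern.quote is used so that characters such as "." or "+" in the input are treated literally, and find() replaces matches() so the surrounding ".*" is not needed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class WordBoundarySearch {

    // Hypothetical helper: keeps the sentences in which `word` appears between word boundaries.
    static List<String> findSentences(String[] sentences, String word) {
        // Compile once; Pattern.quote stops regex metacharacters in `word` from being interpreted
        Pattern pattern = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        return Arrays.stream(sentences)
                .filter(s -> pattern.matcher(s).find())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String[] words = { "Hey", "Hello there", "Hell no" };
        System.out.println(findSentences(words, "Hell"));   // [Hell no]

        String[] numbers = { "10", "50", "110" };
        System.out.println(findSentences(numbers, "10"));   // [10]
    }
}
```

One caveat: \b is defined relative to word characters, so a search term that begins or ends with punctuation (for example "C++") may not match the way you expect with this approach.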
