
Java: Separate a String using a List of Words

How could I separate a String using a pre-given List of Strings, separating the words with spaces?

E.g.:

List of words: words = {"hello", "how", "are", "you"}

The string I want to separate: text = "hellohowareyou"

public static String separateText(String text, List<String> words) {
    String new_text;

    for (String word : words) {
        if (text.startsWith(word)) {
            String suffix = text.substring(word.length());  // 'suffix' is the 'text' without its first word
            new_text += " " + word;  // add the first word of the 'text'
            separateText(suffix, words);
        }
    }
    
    return new_text;
}

And new_text should be "hello how are you".

Note that the order of the List words could be different, and it could contain more words, like a dictionary.

How could I make this recursive, if needed?

This solution is pretty simple, but it is not memory-optimal, because every replace call creates a new String.

public static String separate(String str, Set<String> words) {
    for (String word : words)
        str = str.replace(word, word + ' ');

    return str.trim();
}

Demo

Set<String> words = Set.of("hello", "how", "are", "you");
System.out.println(separate("wow hellohowareyouhellohowareyou", words));
// wow hello how are you hello how are you

Another solution uses StringBuilder and looks better to me from a performance point of view.

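public static String separate(String str, Set<String> words) {
    List<String> res = new LinkedList<>();
    StringBuilder buf = new StringBuilder();

    for (int i = 0; i < str.length(); i++) {
        buf.append(str.charAt(i));
        String candidate = buf.toString().trim();

        // Flush the buffer at a space, or as soon as it holds a known word
        if (str.charAt(i) == ' ' || words.contains(candidate)) {
            if (!candidate.isEmpty())
                res.add(candidate);
            buf.setLength(0);
        }
    }

    // Keep any trailing characters that never matched a word
    String rest = buf.toString().trim();
    if (!rest.isEmpty())
        res.add(rest);

    return String.join(" ", res);
}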

This should do what you want:

public static String separateText(String text, List<String> words) {
    StringBuilder newTextBuilder = new StringBuilder();

    outerLoop:
    while (!text.isEmpty()) {
        for (String word : words) {
            if (!word.isEmpty() && text.startsWith(word)) {
                newTextBuilder.append(word).append(' ');
                text = text.substring(word.length());
                continue outerLoop;
            }
        }
        // No word matches the remaining text: stop instead of looping forever
        break;
    }

    return newTextBuilder.toString().trim();
}

How could I separate a String using a pre-given List of Strings, separating them by spaces?

Pretty much how you already started: check whether the remaining text starts with any of the words from the list, remove that starting word, and keep the suffix.

You did all that already, but instead of just keeping the suffix and continuing to iterate, you decided to call separateText recursively.

That is also a possibility, but simply iterating in a while loop until the suffix (the remaining text) is empty is enough.

    public String separateText(String text, List<String> words) {

        String new_text = "";

        while (!text.isEmpty()) {
            boolean matched = false;
            for (String word : words) {
                if (!word.isEmpty() && text.startsWith(word)) {
                    // 'text' becomes the previous 'text' without its first word
                    text = text.substring(word.length());
                    new_text += " " + word;  // append the word that was just removed
                    matched = true;
                }
            }
            // No word matches the remaining text: stop instead of looping forever
            if (!matched) break;
        }

        return new_text.trim();
    }

Here is a possible recursive solution.

It covers the case where the list of words contains both 'hell' and 'hello': for each word you decide whether to use it or not, and the stop condition checks whether every word in the new string exists in the dictionary.

import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {

    public static String separateWords(String seed, List<String> dictionary, int index) {

        // Stop condition: every dictionary word has been considered
        if (index == dictionary.size()) {
            // Accept the candidate only if every space-separated token is a dictionary word
            boolean allKnown = Arrays.stream(seed.split(" ")).allMatch(dictionary::contains);
            return allKnown ? seed : "";
        }
        String word = dictionary.get(index);
        // Quote the word so it is matched literally rather than as a regex
        String current = seed.replaceFirst(Pattern.quote(word), Matcher.quoteReplacement(word + " "));
        String withWord = separateWords(current, dictionary, index + 1);
        String withoutWord = separateWords(seed, dictionary, index + 1);
        if (withoutWord.length() > withWord.length()) return withoutWord;
        return withWord;
    }

    public static void main(String[] args) {
        List<String> words = List.of("hello", "how", "are", "you");
        String text = "hellohowareyou";
        // The last inserted separator may leave a trailing space
        String result = separateWords(text, words, 0).trim();
        System.out.println(result);
    }
}
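
The "use the word or not" decision described above can also be written as plain backtracking over prefixes. The following is only a minimal sketch under a few assumptions: the class name BacktrackingSeparator and the helper split are hypothetical names, a null return is my own convention for "no complete split exists", and the set of dead-end positions is an optional optimisation that is not part of the answer above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, for illustration only
final class BacktrackingSeparator {

    // Returns the text split into dictionary words, or null if no complete split exists
    static String separate(String text, List<String> words) {
        Deque<String> parts = new ArrayDeque<>();
        Set<Integer> deadEnds = new HashSet<>();
        return split(text, 0, words, parts, deadEnds) ? String.join(" ", parts) : null;
    }

    private static boolean split(String text, int start, List<String> words,
                                 Deque<String> parts, Set<Integer> deadEnds) {
        if (start == text.length()) return true;        // whole text consumed
        if (deadEnds.contains(start)) return false;     // already known to fail from here
        for (String word : words) {
            if (!word.isEmpty() && text.startsWith(word, start)) {
                parts.addLast(word);                    // tentatively use this word
                if (split(text, start + word.length(), words, parts, deadEnds)) return true;
                parts.removeLast();                     // undo and try the next word
            }
        }
        deadEnds.add(start);                            // no word leads to a full split
        return false;
    }

    public static void main(String[] args) {
        List<String> words = List.of("hell", "hello", "how", "are", "you");
        System.out.println(separate("hellohowareyou", words)); // hello how are you
    }
}
```

Here "hell" is tried first, fails on the remainder "ohowareyou", and is undone before "hello" is tried, so the order of the dictionary does not affect whether a split is found.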

For a recursive method, try the following:

public static String separateText(String text, List<String> words) {
    return separateText(text, words, new StringBuilder());
}

public static String separateText(String text, List<String> words, StringBuilder result) {

    for (String word : words) {
        if (!word.isEmpty() && text.startsWith(word)) {
            result.append(word).append(" ");
            // Recurse on the suffix; the full list is passed on so repeated words still match
            separateText(text.substring(word.length()), words, result);
            break;
        }
    }

    return result.toString().trim();
}

import java.util.*;

public class Main {
    public static void main(String[] args) {
        // The words must be sorted by length, longest first; otherwise a shorter
        // word may match where a longer one was intended.
        // In this example they already are.
        List<String> words = Arrays.asList("hello", "how", "are", "you");
        List<String> detectedWords = new ArrayList<>();
        String text = "hellohowareyou";
        int i = 0;
        while (i < text.length()) {
            Optional<String> wordOpt = Optional.empty();

            for (String word : words) {
                // Match only at position i, not anywhere later in the text
                if (text.startsWith(word, i)) {
                    wordOpt = Optional.of(word);
                    break;
                }
            }
            if (!wordOpt.isPresent()) {
                break;  // no word matches here: stop instead of looping forever
            }
            String wordFound = wordOpt.get();
            i += wordFound.length();
            detectedWords.add(wordFound);
        }
        String result = String.join(" ", detectedWords);
        System.out.println(result);
    }
}

I assumed:

  • Your text will never be null
  • Your text matches the regex ^(hello|how|are|you)+$
  • Your words are sorted by length, longest first
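
If the word list is not already ordered, the sorting assumption can be enforced in code before matching. This is only a sketch: the class name LongestFirstSeparator is hypothetical, and throwing IllegalArgumentException when no word matches is my own choice for handling text that violates the second assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper class, for illustration only
final class LongestFirstSeparator {

    static String separate(String text, List<String> words) {
        // Work on a copy so the caller's list is left untouched
        List<String> sorted = new ArrayList<>(words);
        // Longest words first, so "hello" is tried before "hell"
        sorted.sort(Comparator.comparingInt(String::length).reversed());

        List<String> found = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            String match = null;
            for (String word : sorted) {
                if (!word.isEmpty() && text.startsWith(word, i)) {
                    match = word;
                    break;
                }
            }
            if (match == null) {
                throw new IllegalArgumentException("No dictionary word at index " + i);
            }
            found.add(match);
            i += match.length();
        }
        return String.join(" ", found);
    }
}
```

Calling LongestFirstSeparator.separate("hellohowareyou", List.of("hell", "hello", "how", "are", "you")) yields "hello how are you". Note that longest-first matching is still greedy: with the words "ab", "abc" and "cd", the text "abcd" is rejected even though "ab cd" is a valid split, so a backtracking approach is needed when such overlaps can occur.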
