Java: Separate a String using a List of Words
How could I separate a String using a pre-given List of Strings, separating the words by spaces?
E.g.:
List of words:
words = {"hello", "how", "are", "you"}
The string I want to separate:
text = "hellohowareyou"
public static String separateText(String text, List<String> words) {
    String new_text;
    for (String word : words) {
        if (text.startsWith(word)) {
            String suffix = text.substring(word.length()); // 'suffix' is the 'text' without its first word
            new_text += " " + word; // add the first word of the 'string'
            separateString(suffix, words);
        }
    }
    return new_text;
}
And new_text should be: hello how are you
Note that the order of the List words could be different, and the list could also contain more words, like a dictionary.
How could I make this recursive, if needed?
This solution is pretty simple, but it is not memory-optimal, because many new String objects are created.
public static String separate(String str, Set<String> words) {
    for (String word : words)
        str = str.replace(word, word + ' ');
    return str.trim();
}
Demo:
Set<String> words = Set.of("hello", "how", "are", "you");
System.out.println(separate("wow hellohowareyouhellohowareyou", words));
// wow hello how are you hello how are you
Another solution uses StringBuilder and looks better to me from a performance point of view.
public static String separate(String str, Set<String> words) {
    List<String> res = new LinkedList<>();
    StringBuilder buf = new StringBuilder();
    for (int i = 0; i < str.length(); i++) {
        buf.append(str.charAt(i));
        // flush the buffer at a space or as soon as it holds a known word
        if (str.charAt(i) == ' ' || words.contains(buf.toString().trim())) {
            res.add(buf.toString().trim());
            buf.setLength(0);
        }
    }
    // keep trailing characters that did not form a known word
    if (buf.length() > 0)
        res.add(buf.toString());
    return String.join(" ", res);
}
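As a sketch of how the String allocations mentioned above could be reduced further, the matching can be done against an offset with String.startsWith(prefix, offset) instead of creating substrings. The class name OffsetSeparator and the choice to copy unmatched characters through unchanged are assumptions made for this illustration, not part of the original answer.

```java
import java.util.List;

class OffsetSeparator {
    // Greedy split that advances an offset instead of creating substrings.
    // Characters that do not start any known word are copied through unchanged.
    static String separate(String text, List<String> words) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            String match = null;
            for (String word : words) {
                // startsWith(prefix, offset) compares in place, no substring needed
                if (!word.isEmpty() && text.startsWith(word, i)) {
                    match = word;
                    break;
                }
            }
            if (match == null) {
                out.append(text.charAt(i)); // no word starts here: keep the character
                i++;
            } else {
                // separate the word from whatever precedes and follows it
                if (out.length() > 0 && out.charAt(out.length() - 1) != ' ') out.append(' ');
                out.append(match);
                i += match.length();
                if (i < text.length() && text.charAt(i) != ' ') out.append(' ');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> words = List.of("hello", "how", "are", "you");
        System.out.println(separate("wow hellohowareyouhellohowareyou", words));
        // wow hello how are you hello how are you
    }
}
```

Like the loop above, this takes the first word in list order that fits, so a dictionary containing both "hell" and "hello" would need the longer word listed first.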
This should do what you want:
public static String separateText(String text, List<String> words) {
    StringBuilder newTextBuilder = new StringBuilder();
    outerLoop:
    while (text.length() > 0) {
        for (String word : words) {
            if (text.startsWith(word)) {
                newTextBuilder.append(word).append(" ");
                text = text.substring(word.length());
                continue outerLoop;
            }
        }
        // no word matches the remaining text: stop instead of looping forever
        break;
    }
    return newTextBuilder.toString().trim();
}
How could I separate a String using a pre-given List of Strings, separating them by spaces?
Pretty much how you already started: check whether the remaining text starts with any of the words from the list, remove that starting word, and keep the suffix.
You did all that already, but instead of just keeping the suffix and continuing to iterate, you decided to try to call separateText recursively.
That is also a possibility, but simply iterating in a while loop until the suffix (or remaining text) is empty is enough.
public String separateText(String text, List<String> words) {
    String new_text = "";
    while (!text.isEmpty()) {
        boolean matched = false;
        for (String word : words) {
            if (text.startsWith(word)) {
                // 'text' becomes previous 'text' without its first word
                text = text.substring(word.length());
                new_text += " " + word; // add the first word of the 'string'
                matched = true;
            }
        }
        // no word fits the remaining text: stop instead of looping forever
        if (!matched) break;
    }
    return new_text.trim(); // drop the leading space
}
Here is a possible recursive solution.
It covers the case where the list of words contains both 'hell' and 'hello': for each word you decide whether to use it or not, and the stop condition checks whether all words in the new string exist in the word list.
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static String separateWords(String seed, List<String> dictionary, int index) {
        // stop once every dictionary word has been considered
        if (index == dictionary.size()) {
            // collect the tokens that are NOT dictionary words
            String[] unknown = Arrays.stream(seed.trim().split(" "))
                    .filter(word -> !dictionary.contains(word))
                    .toArray(String[]::new);
            if (unknown.length == 0) return seed.trim();
            else return "";
        }
        String word = dictionary.get(index);
        // replaceFirst takes a regex, so quote both the word and the replacement
        String current = seed.replaceFirst(Pattern.quote(word), Matcher.quoteReplacement(word + " "));
        String withWord = separateWords(current, dictionary, index + 1);
        String withoutWord = separateWords(seed, dictionary, index + 1);
        if (withoutWord.length() > withWord.length()) return withoutWord;
        return withWord;
    }

    public static void main(String[] args) {
        List<String> words = List.of("hello", "how", "are", "you");
        String text = "hellohowareyou";
        String result = separateWords(text, words, 0);
        System.out.println(result);
    }
}
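The 'hell' versus 'hello' case can also be handled by backtracking over positions in the text rather than over dictionary entries. The following is only a sketch under assumptions: the class name BacktrackingSeparator, the helper split, and the deadEnds set (offsets already known to fail) are hypothetical names introduced here, and an empty result means that no complete split was found.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BacktrackingSeparator {
    // Returns the text split into dictionary words, or "" if no full split exists.
    static String separate(String text, List<String> dictionary) {
        Deque<String> parts = new ArrayDeque<>();
        Set<Integer> deadEnds = new HashSet<>(); // offsets known to have no solution
        return split(text, 0, dictionary, parts, deadEnds) ? String.join(" ", parts) : "";
    }

    private static boolean split(String text, int start, List<String> dictionary,
                                 Deque<String> parts, Set<Integer> deadEnds) {
        if (start == text.length()) return true;    // the whole text was consumed
        if (deadEnds.contains(start)) return false; // already failed from this offset
        for (String word : dictionary) {
            if (!word.isEmpty() && text.startsWith(word, start)) {
                parts.addLast(word); // choose this word
                if (split(text, start + word.length(), dictionary, parts, deadEnds)) {
                    return true;
                }
                parts.removeLast();  // undo the choice and try the next word
            }
        }
        deadEnds.add(start);
        return false;
    }

    public static void main(String[] args) {
        List<String> words = List.of("hell", "hello", "how", "are", "you");
        System.out.println(separate("hellohowareyou", words));
        // hello how are you
    }
}
```

Here "hell" is tried first, fails because nothing in the list starts at "ohowareyou", and is undone before "hello" is tried. Remembering failed offsets keeps the search from re-exploring the same suffix, and unlike the replaceFirst approach it allows a word to occur several times in the text.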
For a recursive method, try the following:
public static String separateText(String text, List<String> words) {
    return separateText(text, words, new StringBuilder());
}

public static String separateText(String text, List<String> words, StringBuilder result) {
    for (String word : words) {
        if (text.startsWith(word)) {
            result.append(word).append(" ");
            text = text.substring(word.length());
            // recurse with a copy of the list that no longer contains the matched word,
            // so each word is used at most once
            ArrayList<String> newList = new ArrayList<>(words);
            newList.remove(word);
            separateText(text, newList, result);
            break;
        }
    }
    return result.toString().trim();
}
import java.util.*;

public class Main {
    public static void main(String[] args) throws Exception {
        // The words must be sorted by length (longest first), or the result may be
        // wrong because a shorter word could match before a longer one.
        // In this example, that is already the case.
        List<String> words = Arrays.asList("hello", "how", "are", "you");
        List<String> detectedWords = new ArrayList<>();
        String text = "hellohowareyou";
        int i = 0;
        while (i < text.length()) {
            Optional<String> wordOpt = Optional.empty();
            for (String word : words) {
                // the word has to start exactly at position i
                if (text.startsWith(word, i)) {
                    wordOpt = Optional.of(word);
                    break;
                }
            }
            if (!wordOpt.isPresent()) {
                break; // nothing matches at position i: stop instead of looping forever
            }
            String wordFound = wordOpt.get();
            i += wordFound.length();
            detectedWords.add(wordFound);
        }
        String result = String.join(" ", detectedWords);
        System.out.println(result);
    }
}
I assumed:
^(hello|how|are|you)$
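The pattern above suggests building a regex alternation from the word list. A minimal sketch of that idea follows, assuming the anchors are dropped so the pattern can be applied repeatedly with Matcher.find(), and assuming a non-empty list of non-empty words. The class name RegexSeparator is hypothetical, and characters that match no word are silently skipped in this version.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class RegexSeparator {
    static String separate(String text, List<String> words) {
        // Longest words first, so "hello" is preferred over "hell";
        // Pattern.quote keeps regex metacharacters in a word literal.
        String alternation = words.stream()
                .sorted(Comparator.comparingInt((String w) -> w.length()).reversed())
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Matcher matcher = Pattern.compile(alternation).matcher(text);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            found.add(matcher.group()); // the word matched at the current position
        }
        return String.join(" ", found);
    }

    public static void main(String[] args) {
        List<String> words = List.of("hell", "hello", "how", "are", "you");
        System.out.println(separate("hellohowareyou", words));
        // hello how are you
    }
}
```

Sorting inside the method means the caller does not have to keep the list ordered by length.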