
How to divide a Java String into two where the first substring is no longer than x and ends with a whole word

I'm at a loss with dividing a string into two substrings. The first substring should be no longer than 35 characters and should end at the end of a word. So, if the 35-character limit falls mid-word, the string should be broken where that word starts (say, at 32). By "word" I mean any combination of non-space characters; words are separated by spaces. The second substring can be of any length and, consequently, should start at the start of a word. The string is always longer than 35 characters and doesn't follow a pattern. How can I implement this? Thanks in advance!

Example:

"Lordem ipsum dolor sit amet, $200 cons(35 chars until here)ectetur adipiscing elit, №22sed 70 % do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat."

This is a long String. I need to get two strings: "Lordem ipsum dolor sit amet, $200" (fewer than 35 characters, ending where a word ends) and the rest as one big separate substring.

You can use StringTokenizer:

import java.util.StringTokenizer;

public class Test {

    public static void main(String[] args){
        String str = "Lordem ipsum dolor sit amet, $200 cons(35 chars until here)ectetur adipiscing elit, №22sed 70 % do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.";
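        // returnDelims = true keeps the spaces as tokens, so token lengths add up to positions in str
        StringTokenizer strToken = new StringTokenizer(str, " ", true);
        StringBuilder first = new StringBuilder();

        while (strToken.hasMoreTokens()) {
            String next = strToken.nextToken();
            // stop as soon as the next token would push the first part past 35 characters
            if (first.length() + next.length() > 35) {
                break;
            }
            first.append(next);
        }
        // compute the remainder once, after the loop; trim() drops the separating space
        String second = str.substring(first.length()).trim();

        System.out.println(first.toString().trim());
        System.out.println(second);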
    }
}

Or, if you are on Java 9 or higher and want to try streams:

import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class Test {

    public static void main(String[] args){
        String str = "Lordem ipsum dolor sit amet, $200 cons(35 chars until here)ectetur adipiscing elit, №22sed 70 % do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.";

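        // split before and after every space, so the spaces stay in the array as their own elements
        String[] splited = str.split("((?<= )|(?= ))");

        // running total of the piece lengths; takeWhile keeps pieces while the total stays within 35
        // (the stateful predicate relies on the stream being sequential and ordered)
        AtomicInteger ai = new AtomicInteger(0);
        String f = Arrays.stream(splited).takeWhile(i -> ai.addAndGet(i.length()) <= 35).collect(Collectors.joining()).trim();

        // same running total, but dropWhile discards the pieces that went into the first part
        AtomicInteger bi = new AtomicInteger(0);
        String s = Arrays.stream(splited).dropWhile(i -> bi.addAndGet(i.length()) <= 35).collect(Collectors.joining()).trim();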

        System.out.println(f);
        System.out.println(s);
    }
}

You can use the following approach with a length of 35 to get the desired result.

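public static String[] splitAtLengthOrBeforeWord(String s, int length) {
    if(length < 0) {
        throw new IllegalArgumentException("length must not be negative");
    }

    // the whole string already fits into the first part
    if(s.length() <= length) {
        return new String[] { s, "" };
    }

    // start at index `length`: whitespace there means the first part is exactly `length` long
    for(int i = length; i >= 0; i--) {
        if(Character.isWhitespace(s.charAt(i))) {
            // skip the whitespace itself so the second part starts with a word
            return new String[] { s.substring(0, i), s.substring(i + 1) };
        }
    }
    // no whitespace within the limit: not even the first word fits
    return new String[] { "", s };
}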

You can use the lastIndexOf method of the String class. First check whether the character at index 35 is a space; if it is, a simple split at that index is enough. Otherwise, take the first 35 characters and find the last index of a space in them: the word that straddles the limit starts right after that space, which is what we are trying to find. Below is code based on this logic. You can add other safety checks as required.

public class Test {

    public static void main(String[] args) {
        String str = "Lordem ipsum dolor sit amet, $200 cons(35 chars until here)ectetur adipiscing elit, №22sed 70 % do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.";
        String str1, str2;
        if (str.charAt(35) == ' ') {
            // the limit falls exactly on a space: cut there and skip the space
            str1 = str.substring(0, 35);
            str2 = str.substring(36);
        }
        else {
            // the limit falls mid-word: cut at the last space before it
            String temp = str.substring(0, 35);
            int ind = temp.lastIndexOf(' ');
            // ind is -1 when there is no space at all; then nothing fits into the first part
            str1 = ind < 0 ? "" : str.substring(0, ind);
            str2 = str.substring(ind + 1);
        }
        System.out.println(str1);
        System.out.println(str2);
    }

}
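
The two branches above could probably be folded into a single call, since String.lastIndexOf(int ch, int fromIndex) searches backward starting at fromIndex (inclusive). The following is only a sketch of that idea, not part of the original answer: the class name SplitAtWord, the method name split and the parameter maxLen are made up for illustration, and it assumes words are separated by single spaces and that an empty first part is acceptable when no word fits.

```java
public class SplitAtWord {

    // Hypothetical helper: splits s so that the first part has at most
    // maxLen characters and ends at a word boundary.
    public static String[] split(String s, int maxLen) {
        // The whole string fits into the first part.
        if (s.length() <= maxLen) {
            return new String[] { s, "" };
        }
        // Last space at an index <= maxLen; a space exactly at maxLen
        // means the first part is exactly maxLen characters long.
        int cut = s.lastIndexOf(' ', maxLen);
        if (cut < 0) {
            // Assumption: when not even the first word fits, the first part stays empty.
            return new String[] { "", s };
        }
        // Skip the space itself so the second part starts with a word.
        return new String[] { s.substring(0, cut), s.substring(cut + 1) };
    }

    public static void main(String[] args) {
        String str = "Lordem ipsum dolor sit amet, $200 cons(35 chars until here)ectetur adipiscing elit";
        String[] parts = split(str, 35);
        System.out.println(parts[0]); // Lordem ipsum dolor sit amet, $200
        System.out.println(parts[1]); // cons(35 chars until here)ectetur adipiscing elit
    }
}
```

Because the search starts at index maxLen rather than maxLen - 1, a string whose 36th character is a space yields a first part of exactly 35 characters, which is the case the explicit charAt(35) check handles in the code above.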
