Split and add the string based on length

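I have a paragraph as an input string. I am trying to split the paragraph into an array of sentences, where each element contains only whole sentences and is no more than 250 characters long.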

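I tried splitting the string on a delimiter (such as .) and converting all the strings to a list. Using a StringBuilder, I then tried to append the strings based on length (250 characters).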

    List<String> list = new ArrayList<String>();

    String text = "Perhaps far exposed age effects. Now distrusts you her delivered applauded affection out sincerity. As tolerably recommend shameless unfeeling he objection consisted. She although cheerful perceive screened throwing met not eat distance. Viewing hastily or written dearest elderly up weather it as. So direction so sweetness or extremity at daughters. Provided put unpacked now but bringing. Unpleasant astonished an diminution up partiality. Noisy an their of meant. Death means up civil do an offer wound of. Called square an in afraid direct. Resolution diminution conviction so mr at unpleasing simplicity no. No it as breakfast up conveying earnestly immediate principle. Him son disposed produced humoured overcame she bachelor improved. Studied however out wishing but inhabit fortune windows. ";

    Pattern re = Pattern.compile("[^.!?\\s][^.!?]*(?:[.!?](?!['\"]?\\s|$)[^.!?]*)*[.!?]?['\"]?(?=\\s|$)",
            Pattern.MULTILINE | Pattern.COMMENTS);

    Matcher reMatcher = re.matcher(text);
    while (reMatcher.find()) {
        list.add(reMatcher.group());
    }
    String textDelimted[] = new String[list.size()];
    textDelimted = list.toArray(textDelimted);

    StringBuilder stringB = new StringBuilder(100);

    for (int i = 0; i < textDelimted.length; i++) {
        while (stringB.length() + textDelimted[i].length() < 250)
            stringB.append(textDelimted[i]);

        System.out.println("!#@#$%" +stringB.toString());
    }
}

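Expected result:

[0]: Perhaps far exposed age effects. Now distrusts you her delivered applauded affection out sincerity. As tolerably recommend shameless unfeeling he objection consisted. She although cheerful perceive screened throwing met not eat distance.

[1]: Viewing hastily or written dearest elderly up weather it as. So direction so sweetness or extremity at daughters. Provided put unpacked now but bringing. Unpleasant astonished an diminution up partiality. Noisy an their of meant.

[2]: Death means up civil do an offer wound of. Called square an in afraid direct. Resolution diminution conviction so mr at unpleasing simplicity no. No it as breakfast up conveying earnestly immediate principle.

[3]: Him son disposed produced humoured overcame she bachelor improved. Studied however out wishing but inhabit fortune windows.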

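Your question is unclear; please try rephrasing it so that it is completely clear.

That said, I assume that "I tried splitting the string on a delimiter (such as .) and converting all the strings to a list" means you want to split a String wherever a "." occurs and convert the result into a List<String>. That can be done as follows: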

String input = "hello.world.with.delimiters";
String[] words = input.split("\\.");  // String[] with contents {"hello", "world", "with", "delimiters"}
List<String> list = Arrays.asList(words);  // Identical contents, just in a List<String>


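// if you want to append to a StringBuilder based on length
StringBuilder sb = new StringBuilder();
for (String s : list) {
    // someLengthCondition() is a placeholder for whatever length check you need
    if (someLengthCondition(s.length())) sb.append(s);  // append the element, not the whole list
}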

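Of course, what someLengthCondition() looks like depends on what you want. I can't give you a more specific answer, because it is hard to understand what you are trying to do.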
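As one possible reading of the length condition, here is a minimal sketch that keeps appending pieces until the next one would push the result past a limit. The class and method names (LengthConditionSketch, splitSentences, fits, firstChunk) are hypothetical and not part of the question. It also assumes you want to keep the closing punctuation, so it splits with a lookbehind instead of split("\\."), which discards the dots.

```java
import java.util.Arrays;
import java.util.List;

class LengthConditionSketch {

    // Splits on whitespace that follows '.', '!' or '?', so every piece
    // keeps its closing punctuation (split("\\.") would drop the dots).
    static List<String> splitSentences(String input) {
        return Arrays.asList(input.trim().split("(?<=[.!?])\\s+"));
    }

    // Hypothetical length condition: does the next piece still fit,
    // counting one joining space when the builder is not empty?
    static boolean fits(int currentLength, int nextLength, int limit) {
        int separator = currentLength > 0 ? 1 : 0;
        return currentLength + separator + nextLength <= limit;
    }

    // Appends pieces in order and stops at the first one that does not fit.
    static String firstChunk(List<String> pieces, int limit) {
        StringBuilder sb = new StringBuilder();
        for (String s : pieces) {
            if (!fits(sb.length(), s.length(), limit)) {
                break;
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> pieces = splitSentences("One two. Three four. Five six.");
        System.out.println(firstChunk(pieces, 20)); // prints "One two. Three four."
    }
}
```

This only builds the first chunk, which is all a single StringBuilder can hold; producing every chunk needs a fresh builder each time the limit is reached.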

I think you just need to modify the loop a little. My results match the expected output.

import java.util.List;
import java.util.ArrayList;
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class MyClass {
    public static void main(String args[]) {

        List<String> list = new ArrayList<String>();

        String text = "Perhaps far exposed age effects. Now distrusts you her delivered applauded affection out sincerity. As tolerably recommend shameless unfeeling he objection consisted. She although cheerful perceive screened throwing met not eat distance. Viewing hastily or written dearest elderly up weather it as. So direction so sweetness or extremity at daughters. Provided put unpacked now but bringing. Unpleasant astonished an diminution up partiality. Noisy an their of meant. Death means up civil do an offer wound of. Called square an in afraid direct. Resolution diminution conviction so mr at unpleasing simplicity no. No it as breakfast up conveying earnestly immediate principle. Him son disposed produced humoured overcame she bachelor improved. Studied however out wishing but inhabit fortune windows. ";

        Pattern re = Pattern.compile("[^.!?\\s][^.!?]*(?:[.!?](?!['\"]?\\s|$)[^.!?]*)*[.!?]?['\"]?(?=\\s|$)",
                Pattern.MULTILINE | Pattern.COMMENTS);

        Matcher reMatcher = re.matcher(text);
        while (reMatcher.find()) {
            list.add(reMatcher.group());
        }
        String textDelimted[] = new String[list.size()];
        textDelimted = list.toArray(textDelimted);

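        StringBuilder stringB = new StringBuilder(300);

        for (int i = 0; i < textDelimted.length; i++) {
            // The matched sentences carry no leading whitespace, so count one
            // extra character for the space that joins two sentences.
            int separator = stringB.length() > 0 ? 1 : 0;
            if (stringB.length() + separator + textDelimted[i].length() <= 250) {
                if (separator == 1) {
                    stringB.append(' ');
                }
                stringB.append(textDelimted[i]);
            } else {
                // The current chunk is full: print it and start a new one.
                System.out.println("!#@#$%" + stringB.toString());
                stringB = new StringBuilder(300);
                stringB.append(textDelimted[i]);
            }
        }
        // Print the last, partially filled chunk.
        System.out.println("!#@#$%" + stringB.toString());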
    }
}

Replace the println calls with the following code to collect the results in a list:

ArrayList<String> arrlist = new ArrayList<String>(5);
..
arrlist.add(stringB.toString());
..
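
Putting the pieces together, a self-contained version that returns the chunks as a list might look like the sketch below. The class name SentenceChunker and the method chunk are hypothetical. The sentence pattern is the one from the question (compiled here with only the MULTILINE flag), and two choices are assumptions of this sketch rather than anything the question states: sentences are joined with a single space, and a sentence longer than the limit is kept whole as its own chunk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SentenceChunker {

    // Sentence pattern taken from the question.
    private static final Pattern SENTENCE = Pattern.compile(
            "[^.!?\\s][^.!?]*(?:[.!?](?!['\"]?\\s|$)[^.!?]*)*[.!?]?['\"]?(?=\\s|$)",
            Pattern.MULTILINE);

    // Groups whole sentences into chunks of at most maxLength characters.
    static List<String> chunk(String text, int maxLength) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        Matcher matcher = SENTENCE.matcher(text);
        while (matcher.find()) {
            String sentence = matcher.group();
            int separator = current.length() > 0 ? 1 : 0;
            boolean overflows =
                    current.length() + separator + sentence.length() > maxLength;
            // Close the current chunk before it would exceed the limit.
            if (overflows && current.length() > 0) {
                chunks.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(sentence);
        }
        // Keep the last, partially filled chunk.
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> chunks = chunk("Aa bb. Cc dd. Ee ff.", 13);
        for (int i = 0; i < chunks.size(); i++) {
            System.out.println("[" + i + "]: " + chunks.get(i));
        }
    }
}
```

Calling chunk(text, 250) with the paragraph from the question should then give the four groups listed under the expected result.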

