
Java String Segmentation at nth position

I'm writing Java code that handles a long text (up to 500 characters), and I want to split this text into segments of exactly 6 characters each. Here is a sample text:

String fullText = "Syria officially known as the Syrian Arab Republic, is a country in Western Asia...";

I want this result:

Segment1: Syria 

Segment2: offici

Segment3: ally k

Segment n: ...

I have tried a for loop, but it did not produce the result I want, and it also throws this error:

java.lang.StringIndexOutOfBoundsException: length=67; regionStart=65; regionLength=5

Here is my code:

String msg = fullText;
for (int i = -1; i <= fullText.length() + 1; i++) {
    int len = msg.length();
    text = new StringBuilder().append(msgInfo).append(msg.substring(i, i + 6)).toString();
    msg = new StringBuilder().append(msg.substring(i + 5, len)).toString();
    LogHelper.d(TAG, "teeeeeeeeeeeeext:" + i + " .." + text);
}

How can I do this segmentation correctly? Thanks!

You're on the right track, but you've made the process more complicated than it needs to be.

Try this:

int segmentSize = 6;
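// round up so an exact multiple of segmentSize does not leave a trailing null slot
String[] segments = new String[(msg.length() + segmentSize - 1) / segmentSize];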

for (int i = 0; i < msg.length(); i += segmentSize) {
    // ensure we don't try to access out of bounds indexes
    int lastIndex = Math.min(msg.length(), i+segmentSize);
    int segmentNumber = i/segmentSize;
    segments[segmentNumber] = msg.substring(i, lastIndex);
}

This puts the segments into the segments array. Math.min(msg.length(), i+segmentSize) ensures that you never ask substring for characters past the end of the string, which is what caused the StringIndexOutOfBoundsException you mentioned.

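You can do something else with each segment instead of putting it into an array. If your end goal is to build a longer string out of these segments, create a single StringBuilder outside the for loop (for example, where the segments array is declared), append to it inside the loop as needed, and read the result after the loop (i.e. sb.toString()), rather than creating a new StringBuilder instance on every iteration.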
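The StringBuilder suggestion could look roughly like the sketch below. The class name SegmentJoiner, the method joinSegments, and the separator parameter are hypothetical and only there for illustration; the question does not say how the segments should be combined.

```java
// Hypothetical helper; the names and the separator parameter are assumptions for illustration.
final class SegmentJoiner {

    // Splits msg into pieces of segmentSize characters and joins them with separator,
    // reusing one StringBuilder for the whole loop.
    static String joinSegments(String msg, int segmentSize, String separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < msg.length(); i += segmentSize) {
            // Clamp the end index so the last, shorter segment stays in bounds.
            int end = Math.min(msg.length(), i + segmentSize);
            if (i > 0) {
                sb.append(separator);
            }
            // append(CharSequence, start, end) copies the range without a temporary String.
            sb.append(msg, i, end);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinSegments("Syria officially known", 6, "|"));
    }
}
```

Only one StringBuilder is allocated no matter how many segments the text contains.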

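Here is a concise implementation using Java 8 streams:

import java.util.Collection;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

String fullText = "Syria officially known as the Syrian Arab Republic, is a country in Western Asia...";
final AtomicInteger counter = new AtomicInteger(0);
// TreeMap keeps the groups sorted by segment index; a plain HashMap does not guarantee order
Collection<String> strings = fullText.chars()
                                    .mapToObj(i -> String.valueOf((char)i) )
                                    .collect(Collectors.groupingBy(it -> counter.getAndIncrement() / 6
                                                                ,TreeMap::new
                                                                ,Collectors.joining()))
                                    .values();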

Output:

[Syria , offici, ally k, nown a, s the , Syrian,  Arab , Republ, ic, is,  a cou, ntry i, n West, ern As, ia...]
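The counter-based grouping above depends on a mutable AtomicInteger being updated inside the lambda, which only behaves predictably on a sequential stream. As an alternative sketch, not part of the original answer, the segments can be derived from their indexes instead. The class name IndexChunker and the method chunk are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper; the names are assumptions for illustration.
final class IndexChunker {

    // Returns consecutive substrings of at most size characters, in order.
    static List<String> chunk(String text, int size) {
        // Ceiling division: number of segments, counting a shorter final one.
        int count = (text.length() + size - 1) / size;
        return IntStream.range(0, count)
                .mapToObj(i -> text.substring(i * size, Math.min(text.length(), (i + 1) * size)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(chunk("Syria officially known", 6));
    }
}
```

Each segment is computed from its own index, so the lambda holds no shared state.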

You can also split with a regular expression that matches after every nth character, here exactly every 6 characters:

import java.util.Arrays;

String s = "anldhhdhdhhdhdhhdhdhdhdhdhd";
String[] str = s.split("(?<=\\G.{6})");
System.out.println(Arrays.toString(str));

Output:

[anldhh, dhdhhd, hdhhdh, dhdhdh, dhd]
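In that pattern, \G anchors at the end of the previous match and the lookbehind (?<=...) requires exactly 6 characters between that point and the split position. A minimal sketch with a configurable segment size follows. The class name RegexChunker and the method splitEvery are hypothetical, and Pattern.DOTALL is an addition of this sketch so that '.' also counts line terminators, which the original pattern would not.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper; the names are assumptions for illustration.
final class RegexChunker {

    // Splits s after every n characters; the last element may be shorter.
    static String[] splitEvery(String s, int n) {
        // (?<=\G.{n}) is a zero-width match located n characters after the previous match.
        // DOTALL (an assumption of this sketch) lets '.' match newlines as well.
        Pattern p = Pattern.compile("(?<=\\G.{" + n + "})", Pattern.DOTALL);
        return p.split(s);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitEvery("anldhhdhdhhdhdhhdhdhdhdhdhd", 6)));
    }
}
```

This was reasoned through for the standard Java regex engine; other engines may treat \G inside a lookbehind differently, so it is worth testing on the target platform.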

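Why not use a while loop that steps through the text in increments of 6 until fewer than 6 characters remain?

I'm not sure how you intend to use these segments, so for now I've just used print statements that resemble the expected sample output you provided: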

public class StringSegmenter {

    private static final int SEG_LENGTH = 6;
    private static final String PREFIX = "Segment%s: %s\n";

    public static void main(String[] args) {
        String fullText = "Syria officially known as the Syrian Arab Republic, is a country in Western Asia...";

        int position = 0;
        int length = fullText.length();
        int segmentationCount = 0;

        // If at least SEG_LENGTH characters remain, prints the next full segment.
        // If fewer than SEG_LENGTH characters remain, prints the remainder and exits the loop.
        while (position < length) {
            segmentationCount++;

            if ((length - position) < SEG_LENGTH) {

                // Replace this with logging, or StringBuilder appending, etc...
                // substring(position) runs to the end of the string, so the last character is kept
                System.out.printf(PREFIX, segmentationCount, fullText.substring(position));
                break;
            }
            // Replace this with logging, or StringBuilder appending, etc...
            System.out.printf(PREFIX, segmentationCount, fullText.substring(position, position + SEG_LENGTH));
            position += SEG_LENGTH;
        }
    }
}

