Java String Segmentation at nth position
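I am writing Java code that handles a long text (up to 500 characters), and I want to split this text into segments of a fixed size, for example exactly 6 characters per segment. Here is a sample text: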
String fullText = "Syria officially known as the Syrian Arab Republic, is a country in Western Asia...";
I want this result:
segment1: Syria
segment2: offici
segment3: ally k
segment n: ...
I have tried a for loop, but it did not produce this result, and I also got the following error:
java.lang.StringIndexOutOfBoundsException: length=67; regionStart=65; regionLength=5
Here is my code:
String msg = fullText;
for (int i = -1; i <= fullText.length() + 1; i++) {
    int len = msg.length();
    text = new StringBuilder().append(msgInfo).append(msg.substring(i, i + 6)).toString();
    msg = new StringBuilder().append(msg.substring(i + 5, len)).toString();
    LogHelper.d(TAG, "teeeeeeeeeeeeext:" + i + " .." + text);
}
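How can I do this segmentation correctly? Thanks!
You are on the right track, but you have made the process more complicated than it needs to be.
Try this: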
int segmentSize = 6;
// round up so a trailing partial segment gets a slot, without leaving
// a null entry when the length is an exact multiple of segmentSize
String[] segments = new String[(msg.length() + segmentSize - 1) / segmentSize];
for (int i = 0; i < msg.length(); i += segmentSize) {
    // ensure we don't try to access out of bounds indexes
    int lastIndex = Math.min(msg.length(), i + segmentSize);
    int segmentNumber = i / segmentSize;
    segments[segmentNumber] = msg.substring(i, lastIndex);
}
This puts the segments into the segments array. The call Math.min(msg.length(), i+segmentSize) ensures that you never try to take a substring past the end of the string, which is what caused the StringIndexOutOfBoundsException you mentioned.
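You can do something else with the segments instead of putting them into an array. If your end goal is to combine the segments into one longer string, create a single StringBuilder outside the for loop (for example, where the segments array is declared), append to it inside the loop as needed, and read the result after the loop (i.e. sb.toString()), rather than creating a new StringBuilder instance on every loop iteration.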
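The single-StringBuilder idea described above might look like the following minimal sketch. The class name SegmentJoiner, the method labelSegments, and the "segmentN: " label format are illustrative assumptions rather than part of the original code, and segmentSize is assumed to be positive.

```java
class SegmentJoiner {

    // Hypothetical helper: cuts text into chunks of segmentSize characters
    // and appends each one, labeled, to a single shared StringBuilder.
    static String labelSegments(String text, int segmentSize) {
        StringBuilder sb = new StringBuilder(); // created once, outside the loop
        for (int i = 0; i < text.length(); i += segmentSize) {
            // clamp the end index so the final, shorter segment stays in bounds
            int end = Math.min(text.length(), i + segmentSize);
            sb.append("segment").append(i / segmentSize + 1)
              .append(": ").append(text, i, end).append('\n');
        }
        return sb.toString(); // read the result once, after the loop
    }

    public static void main(String[] args) {
        System.out.print(labelSegments("Syria officially known", 6));
    }
}
```

Appending with append(text, i, end) copies the characters straight from the source string, so no intermediate substring is created per segment.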
Here is a concise implementation using Java 8 streams:
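import java.util.Collection;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

String fullText = "Syria officially known as the Syrian Arab Republic, is a country in Western Asia...";
final AtomicInteger counter = new AtomicInteger(0);
Collection<String> strings = fullText.chars()
        .mapToObj(i -> String.valueOf((char) i))
        // TreeMap keeps the groups in segment order; the default HashMap does not guarantee it
        .collect(Collectors.groupingBy(it -> counter.getAndIncrement() / 6,
                TreeMap::new, Collectors.joining()))
        .values();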
Output:
[Syria , offici, ally k, nown a, s the , Syrian,  Arab , Republ, ic, is,  a cou, ntry i, n West, ern As, ia...]
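The counter-based version depends on a stateful lambda, which the stream documentation advises against and which would give wrong groupings on a parallel stream. As an alternative, here is a sketch that swaps in a different technique: it streams over segment indexes and takes one substring per index. The names StreamSegmenter and segments are hypothetical, and size is assumed to be positive.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class StreamSegmenter {

    // Hypothetical helper: one substring per segment index, no shared counter.
    static List<String> segments(String text, int size) {
        int count = (text.length() + size - 1) / size; // ceiling division
        return IntStream.range(0, count)
                // clamp the end index so the last segment may be shorter than size
                .mapToObj(k -> text.substring(k * size, Math.min(text.length(), (k + 1) * size)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(segments("Syria officially known", 6));
    }
}
```

Because each element depends only on its own index, the result keeps its order without needing a sorted map.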
You can also use a regular expression to split after every nth character, here exactly every 6 characters:
String s ="anldhhdhdhhdhdhhdhdhdhdhdhd";
String[] str = s.split("(?<=\\G.{6})");
System.out.println(Arrays.toString(str));
Output:
[anldhh, dhdhhd, hdhhdh, dhdhdh, dhd]
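The same pattern can be built for any segment size. In the sketch below, the names RegexSegmenter and splitEvery are hypothetical and size is assumed to be positive. One detail to be aware of: in Java, '.' does not match line terminators by default, so this version adds the (?s) flag in case the text contains newlines.

```java
import java.util.Arrays;

class RegexSegmenter {

    // Hypothetical helper: builds the lookbehind pattern for a given size.
    static String[] splitEvery(String text, int size) {
        // \G anchors at the end of the previous match (or the start of input),
        // so each split point sits exactly `size` characters after the last one.
        // (?s) lets '.' match line terminators as well.
        return text.split("(?s)(?<=\\G.{" + size + "})");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitEvery("anldhhdhdhhdhdhhdhdhdhdhdhd", 6)));
    }
}
```

Note that splitting an empty string this way returns an array containing one empty string rather than an empty array, which is standard String.split behavior.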
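Why not use a while loop that advances in increments of 6 until fewer than 6 characters remain?
I am not sure how you plan to use the segments, so for now I have just used print statements that resemble the expected sample output you provided: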
public class StringSegmenter {
    private static final int SEG_LENGTH = 6;
    private static final String PREFIX = "Segment%s: %s\n";

    public static void main(String[] args) {
        String fullText = "Syria officially known as the Syrian Arab Republic, is a country in Western Asia...";
        int position = 0;
        int length = fullText.length();
        int segmentationCount = 0;
        // If at least SEG_LENGTH characters remain, prints a full segment.
        // If fewer remain, prints the remainder and exits the loop.
        while (position < length) {
            segmentationCount++;
            if ((length - position) < SEG_LENGTH) {
                // Replace this with logging, or StringBuilder appending, etc...
                // substring(position) runs to the end of the string; using
                // length - 1 as the end index would drop the last character
                System.out.printf(PREFIX, segmentationCount, fullText.substring(position));
                break;
            }
            // Replace this with logging, or StringBuilder appending, etc...
            System.out.printf(PREFIX, segmentationCount, fullText.substring(position, position + SEG_LENGTH));
            position += SEG_LENGTH;
        }
    }
}