
How to trim a String by bytes?

I have a UTF-8 text and I want to trim/truncate it by bytes, so that I get a new string with a custom length in bytes.

public static String trimByBytes(String text, int longitudBytes) throws Exception {

    byte bytes_text[] = text.getBytes("UTF-8");
    int negativeBytes = 0;

    byte byte_trimmed[] = new byte[longitudBytes];
    if (byte_trimmed.length <= bytes_text.length) {
          //copy  array manually and count negativeBytes
        for (int i = 0; i < byte_trimmed.length; i++) {
            byte_trimmed[i] = bytes_text[i];
            if (byte_trimmed[i] < 0) {

                negativeBytes++;
            }
        }
         //if negativeBytes are odd
        if (negativeBytes % 2 != 0 && byte_trimmed[byte_trimmed.length - 1] < 0) {
            byte_trimmed[byte_trimmed.length - 1] = 0;//delete last

        }
    }else{
      for (int i = 0; i < bytes_text.length; i++) {
            byte_trimmed[i] = bytes_text[i];
        }

    }
    return new String(byte_trimmed);
}


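For example:

  • Signature: String trimByBytes(String str, int lengthOfBytes); for example, trimByBytes("Gómez", 1)
  • Gómez is 6 bytes long (but 5 characters long)
  • Gómez trimmed to 3 is Gó: OK
  • Gómez trimmed to 2 is G plus a broken character, but I want G (drop the odd character)
  • Gómez trimmed to 1 is G: OK
  • Gómez trimmed to 8 is Gómez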
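To see why a cut at 2 bytes lands in the middle of a character, it may help to look at the UTF-8 bytes of the example word. The sketch below is only an illustration; the class name GomezBytes and its helper methods are hypothetical and not part of the question.

```java
import java.nio.charset.StandardCharsets;

class GomezBytes {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Space-separated hex dump of the UTF-8 encoding.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String word = "G\u00f3mez"; // "Gómez", escaped so the source encoding does not matter
        System.out.println(word.length());    // 5 (chars)
        System.out.println(utf8Length(word)); // 6 (bytes)
        System.out.println(utf8Hex(word));    // 47 C3 B3 6D 65 7A
    }
}
```

The ó takes two bytes (C3 B3), so keeping only the first 2 bytes of the word keeps G and half of ó.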

Create an explicit CharsetDecoder, and set CodingErrorAction.IGNORE on it.
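As a rough illustration of what the error action changes, the sketch below decodes a byte sequence that was cut in the middle of ó with each of the three CodingErrorAction values. The class name ErrorActionDemo and its decode helper are hypothetical; wrapping the checked exception in UncheckedIOException is just a convenience for the demo.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class ErrorActionDemo {
    // Decodes UTF-8 bytes using the given action for malformed input.
    static String decode(byte[] bytes, CodingErrorAction action) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(action)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            // Only reachable with CodingErrorAction.REPORT.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // 'G' followed by only the first byte of the two-byte sequence C3 B3 (ó).
        byte[] cut = {0x47, (byte) 0xC3};

        System.out.println(decode(cut, CodingErrorAction.IGNORE));           // G
        System.out.println(decode(cut, CodingErrorAction.REPLACE).length()); // 2 (G plus U+FFFD)
        try {
            decode(cut, CodingErrorAction.REPORT);
        } catch (UncheckedIOException e) {
            System.out.println("REPORT rejected the truncated input");
        }
    }
}
```

IGNORE silently drops the incomplete trailing bytes, which is the behavior the question asks for.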

Since CharsetDecoder works with ByteBuffers, applying a length limit is as simple as calling ByteBuffer's limit method:

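import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

String trimByBytes(String str, int lengthOfBytes) {
    byte[] bytes = str.getBytes(StandardCharsets.UTF_8);
    ByteBuffer buffer = ByteBuffer.wrap(bytes);

    // Only shrink the buffer; a limit larger than the data is left alone.
    if (lengthOfBytes < buffer.limit()) {
        buffer.limit(lengthOfBytes);
    }

    // IGNORE drops an incomplete multi-byte sequence at the cut point.
    CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
    decoder.onMalformedInput(CodingErrorAction.IGNORE);

    try {
        return decoder.decode(buffer).toString();
    } catch (CharacterCodingException e) {
        // We will never get here.
        throw new RuntimeException(e);
    }
}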
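To check the method against the examples from the question, here is a minimal runnable sketch. The class name TrimDemo is hypothetical, and the method is repeated (made static) only so the snippet compiles on its own.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class TrimDemo {
    // Same approach as above: limit the buffer, then decode while ignoring
    // the incomplete sequence that the cut may leave at the end.
    static String trimByBytes(String str, int lengthOfBytes) {
        ByteBuffer buffer = ByteBuffer.wrap(str.getBytes(StandardCharsets.UTF_8));
        if (lengthOfBytes < buffer.limit()) {
            buffer.limit(lengthOfBytes);
        }
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        decoder.onMalformedInput(CodingErrorAction.IGNORE);
        try {
            return decoder.decode(buffer).toString();
        } catch (CharacterCodingException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String word = "G\u00f3mez"; // "Gómez", 6 bytes in UTF-8
        System.out.println(trimByBytes(word, 1)); // G
        System.out.println(trimByBytes(word, 2)); // G (the half of ó is dropped)
        System.out.println(trimByBytes(word, 3)); // Gó
        System.out.println(trimByBytes(word, 8)); // Gómez
    }
}
```

Note that a negative lengthOfBytes makes ByteBuffer.limit throw an IllegalArgumentException, so callers may want to validate the argument first.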
