
How to iterate/navigate through each character in a character set (e.g., US-ASCII or IBM037, in proper sequence)?

I want to iterate through each character in a character set (primarily US-ASCII and IBM037) and then print all alphanumeric characters (or all printable characters) in the proper character set sequence. Is it possible without creating static arrays?

Try the following to print every character the charset can represent, in order of encoded value.

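import java.math.BigInteger;
import java.nio.charset.Charset;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// The class name is arbitrary; it only makes the snippet compile on its own.
public class CharsetMap {
    public static void main(String... args) {
        printCharactersFor("US-ASCII");
        printCharactersFor("IBM037");
    }

    private static void printCharactersFor(String charsetName) {
        System.out.println("Character set map for " + charsetName);
        Charset charset = Charset.forName(charsetName);
        // Keyed by the encoded bytes read as an unsigned number, so iteration follows the charset's own order.
        SortedMap<BigInteger, String> charsInEncodedOrder = new TreeMap<>();
        for (int i = Character.MIN_VALUE; i <= Character.MAX_VALUE; i++) {
            String s = Character.toString((char) i);
            byte[] encoded = s.getBytes(charset);
            String decoded = new String(encoded, charset);
            // Keep only characters that survive an encode/decode round trip, i.e. ones the charset can represent.
            if (s.equals(decoded)) {
                charsInEncodedOrder.put(new BigInteger(1, encoded), i + " " + s);
            }
        }
        for (Map.Entry<BigInteger, String> entry : charsInEncodedOrder.entrySet()) {
            // Each line: encoded value in hex, Unicode code point in decimal, then the character itself.
            System.out.println(entry.getKey().toString(16) + " " + entry.getValue());
        }
    }
}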

It produces output that matches http://www.fileformat.info/info/charset/IBM037/grid.htm

This worked for me. Thanks for all the feedback!

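// Needs: import java.nio.charset.Charset;
// charsetName is supplied by the caller, e.g. "US-ASCII" or "IBM037".
final Charset charset = Charset.forName(charsetName);
for (int i = 0; i < 256; i++) {  // 256, not 255, so that byte 0xFF is included
    // Decode exactly one byte; this avoids the 4-byte buffer and the trim() that also drops spaces.
    String decoded = new String(new byte[] {(byte) i}, charset);
    System.out.println(decoded);
}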

Iterate over the values from 0 to 127 (or 255) and decode them using the wanted character set, which yields ordinary Java char values. You can then check whether each value is alphanumeric with Character.isLetterOrDigit(char) and use them to your liking.
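A minimal sketch of that approach, assuming a single-byte charset where each byte decodes to exactly one char. The class name CharsetAlphanumerics and the helper alphanumerics are hypothetical names made up for this illustration, not part of any answer above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetAlphanumerics {

    // Hypothetical helper: decodes byte values 0..limit-1 one at a time and
    // keeps the characters that Character.isLetterOrDigit accepts, in byte order.
    public static String alphanumerics(Charset charset, int limit) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < limit; i++) {
            String decoded = new String(new byte[] {(byte) i}, charset);
            // Assumption: single-byte charset, so one byte gives one char.
            // Anything else is skipped rather than guessed at.
            if (decoded.length() != 1) {
                continue;
            }
            char c = decoded.charAt(0);
            if (Character.isLetterOrDigit(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(alphanumerics(StandardCharsets.US_ASCII, 128));
        // IBM037 is an extended charset and may be missing from some runtimes.
        if (Charset.isSupported("IBM037")) {
            System.out.println(alphanumerics(Charset.forName("IBM037"), 256));
        }
    }
}
```

For US-ASCII this prints the digits, then the uppercase letters, then the lowercase letters. For IBM037 the order follows the EBCDIC layout instead (lowercase before uppercase before digits), and because Character.isLetterOrDigit also accepts accented letters, those appear interleaved at their encoded positions.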
