regular expression to remove all non-printable characters

I want to remove all non-printable ASCII characters from a string while retaining the invisible ones. I thought this would work because whitespace, \\n and \\r are invisible characters but not non-printable ones. Basically, I am receiving a byte array that contains characters I don't want in it, so I am converting it to a string, removing those characters, and then using the result as a byte array again.

Space works fine in my code now, but \\r and \\n do not. What would be the correct regex to retain these as well? Or is there a better way than what I am doing?

public void write(byte[] bytes, int offset, int count) {
    try {
        String str = new String(bytes, "ASCII");
        String str2 = str.replaceAll("[^\\p{Print}\\t\\n]", "");
        GraphicsTerminalActivity.sendOverSerial(str2.getBytes("ASCII"));
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }
}

EDIT: I tried [^\\x00-\\x7F], which is the range of ASCII characters, but the symbols still get through. Weird.

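The following regex matches only runs of text that contain none of the control characters it lists. Tab (\x09), \n (\x0A) and \r (\x0D) are not listed, so they are kept: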

[^\x00\x08\x0B\x0C\x0E-\x1F]*

The following regex finds a single one of those non-printable characters. Note that it does not cover \x01-\x07 or \x7F:

[\x00\x08\x0B\x0C\x0E-\x1F]

Java code:

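import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

boolean foundMatch = false;
try {
    // Matches any one of the listed control characters
    Pattern regex = Pattern.compile("[\\x00\\x08\\x0B\\x0C\\x0E-\\x1F]");
    Matcher regexMatcher = regex.matcher(subjectString);
    foundMatch = regexMatcher.find();
    // Replace the found text with whatever you want
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}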
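
The snippet above only detects a match. As a rough sketch of the replacement step, assuming the goal is to keep printable ASCII (0x20-0x7E) plus tab, \n and \r, something like the following could work. The class and method names (StripNonPrintable, strip) are made up for illustration. For context: [^\x00-\x7F] from the question's edit removes only non-ASCII characters, and the control characters all sit inside 0x00-0x1F, so they survive it.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question or answer.
class StripNonPrintable {
    // Anything that is NOT printable ASCII (0x20-0x7E), tab, LF or CR.
    private static final Pattern NON_PRINTABLE =
            Pattern.compile("[^\\x20-\\x7E\\t\\n\\r]");

    static String strip(String input) {
        // Remove every matching character; the rest is returned unchanged.
        return NON_PRINTABLE.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // NUL, BEL and DEL are dropped; CR, LF and tab are retained.
        String raw = "ab\u0000c\u0007\r\n\td\u007F";
        System.out.println(strip(raw).equals("abc\r\n\td")); // prints true
    }
}
```

Compiling the pattern once as a constant avoids recompiling it on every call, which String.replaceAll would otherwise do.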

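I would prefer a simpler solution here that works on the bytes directly. By the way, your code ignores offset and count. The solution below compacts the kept bytes in place, so it overwrites the original array. Note that the range check as written also drops \t, \n and \r; add them to the condition if they should be kept.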

public void write(byte[] bytes, int offset, int count) {
    int writtenI = offset;
    for (int readI = offset; readI < offset + count; ++readI) {
        byte b = bytes[readI];
        if (32 <= b && b < 127) {
            // ASCII printable:
            bytes[writtenI] = bytes[readI]; // writtenI <= readI
            ++writtenI;
        }
    }
    byte[] bytes2 = new byte[writtenI - offset];
    System.arraycopy(bytes, offset, bytes2, 0, writtenI - offset);
    //String str = new String(bytes, offset, writtenI - offset, "ASCII");
    //bytes2 = str.getBytes("ASCII");
    GraphicsTerminalActivity.sendOverSerial(bytes2);
}
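
If \t, \n and \r should be retained, as the question asks, and the input array should stay untouched, a variant along these lines might do. This is only a sketch: PrintableByteFilter and filter are hypothetical names, and the returned array would then be passed to sendOverSerial by the caller.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper class, shown only to illustrate the idea.
class PrintableByteFilter {
    // Returns a new array holding the printable ASCII bytes (plus tab, LF, CR)
    // found in bytes[offset .. offset + count - 1]; the input is not modified.
    static byte[] filter(byte[] bytes, int offset, int count) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(count);
        for (int i = offset; i < offset + count; i++) {
            byte b = bytes[i];
            // Java bytes are signed, so values >= 0x80 are negative and fail this test.
            boolean printable = b >= 0x20 && b < 0x7F;
            boolean keptWhitespace = b == '\t' || b == '\n' || b == '\r';
            if (printable || keptWhitespace) {
                out.write(b);
            }
        }
        return out.toByteArray();
    }
}
```

Copying into a fresh buffer costs one extra allocation compared with compacting in place, but the caller's array is left intact.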
