
Short, case-insensitive string obfuscation strategy

I am looking for a way to identify (i.e. encode and decode) a set of Java strings with one token. The identification should not involve DB persistence. So far I have looked into Base64 encoding and DES encryption, but neither is optimal with respect to the following requirements:

  • Token should be as short as possible
  • Token should be insensitive to casing
  • Token should survive a URLEncoder/URLDecoder round-trip (i.e. it will be used in URLs)

Is Base32 my best shot or are there better options? Note that I'm primarily interested in shortening and obfuscating the set; encryption/security is not important.

What is the structure of the text (i.e. the set of strings)? You could use your knowledge of it to encode it in a shorter form. E.g. if you have a large decimal number such as "1234567890", you could convert it to base 36, which is shorter ("kf12oi", 6 characters instead of 10).

Otherwise it looks like you are trying to invent a universal archiver.
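The base-36 idea above can be sketched with nothing but the JDK. This is only an illustration for the case where the data happens to be a number that fits in a long; the class name Base36Demo is made up for the example.

```java
final class Base36Demo {
    // Long.toString with radix 36 uses the digits 0-9 and a-z.
    static String encode(long n) {
        return Long.toString(n, 36);
    }

    // Long.parseLong accepts both upper- and lower-case digits,
    // so the token is case-insensitive without extra work.
    static long decode(String token) {
        return Long.parseLong(token, 36);
    }

    public static void main(String[] args) {
        String token = encode(1234567890L);
        System.out.println("1234567890 ==> " + token);
        System.out.println(token.toUpperCase() + " ==> " + decode(token.toUpperCase()));
    }
}
```

The token contains only digits and letters, so it also passes through URLEncoder unchanged.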

If you don't care about length, then yes, running the text through an alphabet-based encoder (such as Base32) is the only choice.

Also, if the text is large enough, maybe you could save some space by gzipping it.
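Whether gzip helps depends heavily on the input size, because the gzip container adds a fixed header and trailer. A quick way to check on your own data is to measure the compressed size, as in the sketch below (GzipSizeCheck is a name made up for the example, and the sample strings are arbitrary).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

final class GzipSizeCheck {
    // Returns the number of bytes the text occupies after gzip compression.
    static int gzippedSize(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the trailer has been written.
        return buffer.size();
    }

    public static void main(String[] args) {
        String shortText = "ABCXYZdefxyz123789";
        StringBuilder longText = new StringBuilder();
        for (int i = 0; i < 1000; i++) longText.append("abc");

        System.out.println("short: " + shortText.length() + " -> " + gzippedSize(shortText));
        System.out.println("long:  " + longText.length() + " -> " + gzippedSize(longText.toString()));
    }
}
```

For a short string the compressed form is larger than the input, and the result is binary, so it would still need a URL-safe text encoding afterwards.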

Rot13 obfuscates but does not shorten. Zip shortens (usually) but does not survive the URL round trip. Encryption will not shorten, and may lengthen. Hashing shortens but is one-way. You do not have an easy problem. Base32 is case insensitive, but takes more space than Base64, which isn't. I suspect that you are going to have to drop or modify your requirements. Which requirements are most important and which least important?
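To make the URL and size trade-offs concrete, here is a small sketch. It checks whether URLEncoder leaves a token untouched and computes the padded output lengths of standard Base64 and Base32. UrlSafetyCheck and its method names are invented for the example; only java.net.URLEncoder and java.util.Base64 are real JDK classes.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Base64;

final class UrlSafetyCheck {
    // True if URLEncoder leaves the token untouched (nothing needed escaping).
    static boolean passesUnchanged(String token) {
        try {
            return URLEncoder.encode(token, "UTF-8").equals(token);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Padded output lengths for n input bytes.
    static int base64Length(int n) { return (n + 2) / 3 * 4; } // 4 chars per 3 bytes
    static int base32Length(int n) { return (n + 4) / 5 * 8; } // 8 chars per 5 bytes

    public static void main(String[] args) {
        // These two bytes produce '+', '/' and '=' in standard Base64.
        String base64 = Base64.getEncoder().encodeToString(new byte[] {(byte) 0xfb, (byte) 0xff});
        System.out.println(base64 + " passes unchanged: " + passesUnchanged(base64));
        System.out.println("abc123 passes unchanged: " + passesUnchanged("abc123"));
        System.out.println("30 bytes: Base64 " + base64Length(30)
                + " chars, Base32 " + base32Length(30) + " chars");
    }
}
```

Tokens made only of digits and letters pass through unchanged, while standard Base64 output can contain characters that get percent-escaped.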

I have spent some time on this and I have a good solution for you.

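Treat each input character as a 6-bit value (base 64; your chars are 0-9a-zA-Z), then re-encode the resulting bit stream as a custom base32 that uses 0-9a-v. Essentially, you lay out the bits 6 at a time, then consume them 5 at a time. The token is only about 20% longer than the input (6 bits per input character versus 5 bits per token character). For example, the 18-character ABCXYZdefxyz123789 encodes as the 22-character i9crnsuj9ov1h8o4433i14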

Here's an implementation that works, including some test code that proves it is case-insensitive:

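import org.junit.Assert; // JUnit 4, only needed by main() below

// The members below go inside a class of your choice.
// Note: You can add up to 2 more chars to this if you want to (6 bits can index 64 values)
static String chars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";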

private static String decodeToken(String encoded) {
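    // Lay out the bits 5 at a time; lower-casing first makes decoding case-insensitive
    // (Locale.ROOT avoids locale-specific case mappings such as the Turkish dotless i)
    StringBuilder sb = new StringBuilder();
    for (char c : encoded.toLowerCase(java.util.Locale.ROOT).toCharArray())
        sb.append(asBits(chars.indexOf(c), 5));

    // Drop the zero bits that were added as padding during encoding
    sb.setLength(sb.length() - (sb.length() % 6));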

    // Consume it 6 bits at a time
    int length = sb.length();
    StringBuilder result = new StringBuilder();
    for (int i = 0; i < length; i += 6)
        result.append(chars.charAt(Integer.parseInt(sb.substring(i, i + 6), 2)));

    return result.toString();
}

private static String generateToken(String x) {
    StringBuilder sb = new StringBuilder();
    // Lay out the bits 6 at a time; every input character must be in chars
    for (char c : x.toCharArray())
        sb.append(asBits(chars.indexOf(c), 6));

    // Pad with zeros up to a multiple of 5 bits
    sb.append("00000".substring(0, (5 - sb.length() % 5) % 5));

    // Consume it 5 bits at a time
    int length = sb.length();
    StringBuilder result = new StringBuilder();
    for (int i = 0; i < length; i += 5)
        result.append(chars.charAt(Integer.parseInt(sb.substring(i, i + 5), 2)));

    return result.toString();
}

private static String asBits(int index, int width) {
    String bits = "000000" + Integer.toBinaryString(index);
    return bits.substring(bits.length() - width);
}

public static void main(String[] args) {
    String input = "ABCXYZdefxyz123789";
    String token = generateToken(input);
    System.out.println(input + " ==> " + token);
    Assert.assertEquals("mixed", input, decodeToken(token));
    Assert.assertEquals("lower", input, decodeToken(token.toLowerCase()));
    Assert.assertEquals("upper", input, decodeToken(token.toUpperCase()));
    System.out.println("pass");
}
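For comparison, the same 6-bit to 5-bit repacking can be written with an integer bit buffer instead of strings of '0' and '1'. This is a sketch of an alternative, not the code from the answer above: the class name BitRepack and the decision to throw on characters outside the alphabet are assumptions made for the example.

```java
import java.util.Locale;

final class BitRepack {
    // Same 62-character alphabet as above; its first 32 entries (0-9, a-v)
    // double as the token alphabet.
    static final String ALPHABET =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    static String encode(String input) {
        StringBuilder out = new StringBuilder();
        int buffer = 0, bits = 0;
        for (int i = 0; i < input.length(); i++) {
            int v = ALPHABET.indexOf(input.charAt(i));
            if (v < 0) throw new IllegalArgumentException("unsupported char at " + i);
            buffer = (buffer << 6) | v;   // push 6 bits
            bits += 6;
            while (bits >= 5) {           // pop 5 bits at a time
                bits -= 5;
                out.append(ALPHABET.charAt((buffer >> bits) & 31));
            }
            buffer &= (1 << bits) - 1;    // keep only the unread bits
        }
        if (bits > 0)                     // pad the last group with zeros
            out.append(ALPHABET.charAt((buffer << (5 - bits)) & 31));
        return out.toString();
    }

    static String decode(String token) {
        String lower = token.toLowerCase(Locale.ROOT);
        StringBuilder out = new StringBuilder();
        int buffer = 0, bits = 0;
        for (int i = 0; i < lower.length(); i++) {
            int v = ALPHABET.indexOf(lower.charAt(i));
            if (v < 0 || v > 31) throw new IllegalArgumentException("bad token char at " + i);
            buffer = (buffer << 5) | v;   // push 5 bits
            bits += 5;
            if (bits >= 6) {              // pop 6 bits when available
                bits -= 6;
                int index = (buffer >> bits) & 63;
                if (index >= ALPHABET.length())
                    throw new IllegalArgumentException("corrupt token");
                out.append(ALPHABET.charAt(index));
            }
            buffer &= (1 << bits) - 1;
        }
        return out.toString();            // leftover bits are padding
    }
}
```

Rejecting unknown characters up front avoids silently producing a token that decodes to something other than the input.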

