
Best way to perfectly hash a three letter lowercase String in Java?

My current solution uses a multi-dimensional array. Does a simpler solution exist? I want to access the hashed objects in O(1) time and make the best use of memory, hence the need for a perfect hash.

public final class PerfectHash {

private Object[][][] hashtable = new Object[26][26][26];

public void storeObjectAgainst3letterStringKey(Object o, String s){

    int[] coord = stringToCoord(s);
    hashtable[coord[0]][coord[1]][coord[2]] = o;

}

public Object get(String s){
    int[] coord = stringToCoord(s);
    return hashtable[coord[0]][coord[1]][coord[2]];
}

private int[] stringToCoord(String s){
    if (!s.matches("[a-z][a-z][a-z]")){
        throw new IllegalStateException("invalid input, expecting 3 alphabet letters");
    }
    // each of the three coordinates ends up in the range 0-25
    String lowercase = s.toLowerCase();

    // 97-122 integers for lower case ascii
    int[] coord = new int[3];
    for (int i=0;i<lowercase.length();++i){
        int ascii = (int)lowercase.charAt(i);
        int alpha = ascii - 97; // 0-25     
        coord[i] = alpha;
    }
    return coord;
}
}

You don't even need to convert the String first. If your three characters are lower case, you can do this.

public static int hashFor(String s) {
    assert s.length() == 3 && isLower(s.charAt(0)) && isLower(s.charAt(1)) && isLower(s.charAt(2));

    return ((s.charAt(0) - 'a') * 26 + s.charAt(1) - 'a') * 26 + s.charAt(2) - 'a';
}

// Checks for a-z only, not for every Unicode lowercase letter.
public static boolean isLower(char ch) {
    return ch >= 'a' && ch <= 'z';
}
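
As a rough sanity check that this mapping really is a perfect (and minimal) hash, the sketch below enumerates every key from "aaa" to "zzz" and counts the distinct indices produced. The class name `HashForCheck` and the helper `countDistinct` are made up for illustration; only `hashFor` comes from the answer.

```java
import java.util.BitSet;

final class HashForCheck {
    // Same formula as hashFor above; assumes s is exactly three letters in a-z.
    static int hashFor(String s) {
        return ((s.charAt(0) - 'a') * 26 + s.charAt(1) - 'a') * 26 + s.charAt(2) - 'a';
    }

    // Hypothetical helper: hashes all 26^3 keys and returns how many
    // distinct indices were produced.
    static int countDistinct() {
        BitSet seen = new BitSet(26 * 26 * 26);
        char[] key = new char[3];
        for (key[0] = 'a'; key[0] <= 'z'; key[0]++)
            for (key[1] = 'a'; key[1] <= 'z'; key[1]++)
                for (key[2] = 'a'; key[2] <= 'z'; key[2]++)
                    seen.set(hashFor(new String(key)));
        return seen.cardinality();
    }
}
```

If the count equals 26 * 26 * 26 = 17576, no two keys collide, and since the indices run from 0 ("aaa") to 17575 ("zzz"), an array of exactly that size has no wasted slots.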

A slightly more optimised version is:

public static int hashFor(String s) {
    return s.charAt(0) * (26 * 26) + s.charAt(1) * 26 + s.charAt(2) - ('a' * (26*26+26+1));
}

The sub-expressions that involve only constants, such as 26 * 26 and 'a' * (26*26+26+1), are evaluated by the compiler, so they cost nothing at run time.
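
The flattened formula comes from expanding (c0 - 'a') * 676 + (c1 - 'a') * 26 + (c2 - 'a') and collecting the three 'a' terms into the single constant 'a' * (676 + 26 + 1). A minimal sketch that checks the two forms agree on every valid key follows; `HashFormsCheck`, `nested`, `flattened` and `formsAgree` are illustrative names, not part of the answer.

```java
final class HashFormsCheck {
    // Nested form from the first version of hashFor.
    static int nested(String s) {
        return ((s.charAt(0) - 'a') * 26 + s.charAt(1) - 'a') * 26 + s.charAt(2) - 'a';
    }

    // Flattened form: the 'a' subtractions are collected into one constant.
    static int flattened(String s) {
        return s.charAt(0) * (26 * 26) + s.charAt(1) * 26 + s.charAt(2)
                - ('a' * (26 * 26 + 26 + 1));
    }

    // Returns true if both forms give the same index for "aaa" through "zzz".
    static boolean formsAgree() {
        for (char a = 'a'; a <= 'z'; a++)
            for (char b = 'a'; b <= 'z'; b++)
                for (char c = 'a'; c <= 'z'; c++) {
                    String key = "" + a + b + c;
                    if (nested(key) != flattened(key)) return false;
                }
        return true;
    }
}
```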

BTW, String.matches() recompiles the regular expression on every call, so it is likely to be around 100x slower than everything else in the method. ;)
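
If validation is still wanted, two cheaper options are a regex compiled once, or a plain character loop. The following is only a sketch: `KeyValidation`, `isValidRegex` and `isValidManual` are hypothetical names, and no timings are claimed here.

```java
import java.util.regex.Pattern;

final class KeyValidation {
    // Compiled once and reused, instead of once per call as with String.matches().
    private static final Pattern THREE_LOWER = Pattern.compile("[a-z]{3}");

    static boolean isValidRegex(String s) {
        return THREE_LOWER.matcher(s).matches();
    }

    // No regex at all: a length check plus three range comparisons.
    static boolean isValidManual(String s) {
        if (s.length() != 3) return false;
        for (int i = 0; i < 3; i++) {
            char ch = s.charAt(i);
            if (ch < 'a' || ch > 'z') return false;
        }
        return true;
    }
}
```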

You don't need to convert to lower case if you have already determined it has to be in lowercase.

You could also use a one-dimensional array instead of a three-dimensional one.

Then compute the flat index from the three coordinates:

public Object get(String s){
    int[] coord = stringToCoord(s);
    int hashindex = (coord[0]*26 + coord[1])*26 + coord[2];
    return hashtable[hashindex];
}

Also, look into trie data structures; they are useful for efficient string lookup.
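
For fixed three-letter keys, a trie can be sketched as nested 26-slot arrays that are allocated only along branches actually in use. This is an illustration under the assumption that keys are already validated as three letters in a-z; `ThreeLetterTrie` is a made-up name.

```java
final class ThreeLetterTrie {
    // Each node has 26 slots. The first two levels hold child nodes,
    // the third level holds the stored values.
    private final Object[] root = new Object[26];

    void put(String key, Object value) {
        Object[] node = root;
        for (int i = 0; i < 2; i++) {
            int idx = key.charAt(i) - 'a';
            if (node[idx] == null) node[idx] = new Object[26]; // allocate on demand
            node = (Object[]) node[idx];
        }
        node[key.charAt(2) - 'a'] = value;
    }

    Object get(String key) {
        Object[] node = root;
        for (int i = 0; i < 2; i++) {
            Object child = node[key.charAt(i) - 'a'];
            if (child == null) return null; // branch was never created
            node = (Object[]) child;
        }
        return node[key.charAt(2) - 'a'];
    }
}
```

Lookup is still a fixed three steps, but it follows references between nodes, so the flat array is the simpler choice when most of the 17576 keys are expected to be present.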

The only thing that might be more efficient is mapping your strings directly to a single hash value and doing the lookup in a one-dimensional array:

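public final class PerfectHash {
  private final Object[] hashtable = new Object[26 * 26 * 26];

  private int getHash(String s) {
      // Check the length first so that charAt cannot throw on short input.
      if (s.length() != 3)
        throw new IllegalStateException("invalid input, expecting 3 alphabet letters");
      // The cast to char wraps anything below 'a' around to a large value,
      // so a single ">= 26" test rejects characters on either side of a-z.
      char a = (char) (s.charAt(0) - 'a'), b = (char) (s.charAt(1) - 'a'), c = (char) (s.charAt(2) - 'a');
      if (a >= 26 || b >= 26 || c >= 26)
        throw new IllegalStateException("invalid input, expecting 3 alphabet letters");
      return (a * 26 + b) * 26 + c;
  }

  public Object get(String s) { return hashtable[getHash(s)]; }
  public void set(String s, Object o) { hashtable[getHash(s)] = o; }
}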
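
If casting the result of get() at every call site is a nuisance, the same idea can be wrapped in a generic class. The sketch below is one possible shape, not the answerer's code: `ThreeLetterMap`, `indexOf` and `put` are hypothetical names, and it throws IllegalArgumentException rather than IllegalStateException as a design choice.

```java
final class ThreeLetterMap<V> {
    // One slot per possible key; generic arrays cannot be created directly,
    // so values are stored as Object and cast back on read.
    private final Object[] slots = new Object[26 * 26 * 26];

    // Maps "aaa".."zzz" to 0..17575, rejecting anything else.
    private static int indexOf(String key) {
        if (key.length() != 3)
            throw new IllegalArgumentException("expected exactly 3 letters: " + key);
        int index = 0;
        for (int i = 0; i < 3; i++) {
            int d = key.charAt(i) - 'a';
            if (d < 0 || d >= 26)
                throw new IllegalArgumentException("expected letters a-z: " + key);
            index = index * 26 + d;
        }
        return index;
    }

    void put(String key, V value) {
        slots[indexOf(key)] = value;
    }

    @SuppressWarnings("unchecked")
    V get(String key) {
        return (V) slots[indexOf(key)];
    }
}
```

A caller would then write something like `ThreeLetterMap<Integer> m = new ThreeLetterMap<>(); m.put("abc", 42);` and read the value back with `m.get("abc")` without a cast.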
