
Cracking the coding interview anagram: unique character variable check

I have a specific question about the "Anagram" problem in Cracking the Coding Interview. In the code below, the method 'anagram' is the one from the book and 'anagram2' is my own logic, which omits the unique-character tracker and the completed-character tracker. Since we already ensure that both strings have equal length, I assumed the count of each char is all we need to keep track of. For the inputs I have tried, both methods give identical answers. I want to understand which test cases I am missing that would warrant the additional logic in 'anagram'. Any help is appreciated!

Problem definition - Write a method to decide if two strings are anagrams or not.

public class anagram {
    public static boolean anagram(String s, String t) {

         if (s.length() != t.length()) return false;
         int[] letters = new int[256];
         int num_unique_chars = 0;
         int num_completed_t = 0;
         char[] s_array = s.toCharArray();
         for (char c : s_array) { // count number of each char in s.
             if (letters[c] == 0) ++num_unique_chars;
             ++letters[c];
         }
         for (int i = 0; i < t.length(); ++i) {
             int c = (int) t.charAt(i);
             if (letters[c] == 0) { // Found more of char c in t than in s.
                 return false;
             }
             --letters[c];
             if (letters[c] == 0) {
                 ++num_completed_t;
                 if (num_completed_t == num_unique_chars) {
                     // it’s a match if t has been processed completely
                     return i == t.length() - 1;
                 }
             }
         }
         return false;
    }
    public static boolean anagram2(String s, String t) {
        if (s.length() != t.length()) return false;
        int[] letters = new int[256];
        char[] s_array = s.toCharArray();
        for (char c : s_array) { // count number of each char in s.
            ++letters[c];
        }
        for (int i = 0; i < t.length(); ++i) {
            int c = (int) t.charAt(i);
            if (letters[c] == 0) { // Found more of char c in t than in s.
                return false;
            }
            --letters[c];
            if (letters[c] == 0) {
                if (i == t.length() - 1) {
                    // it’s a match if t has been processed completely
                    return i == t.length() - 1;
                }
            }
        }
        return false;
    }

    public static void main(String args[]) {
        System.out.println(anagram("onex","noey"));
        System.out.println(anagram2("onex","noey"));
        System.out.println(anagram("onen","noen"));
        System.out.println(anagram("abcde", "abedc"));
        System.out.println(anagram("ababab", "baaabb"));
        System.out.println(anagram("aaaa", "aaaa"));
        System.out.println(anagram2("onen", "noen"));
        System.out.println(anagram2("abcde", "abedc"));
        System.out.println(anagram2("ababab", "baaabb"));
        System.out.println(anagram2("aaaa", "aaaa"));
    }
}

If you are not sure and you have a reference implementation, simply try an exhaustive search over a limited domain. I used the string representations of numbers for testing.

private static final int LIMIT = 9999;

public static final void main(final String[] args) {
    for (int i = 0; LIMIT > i; i++) {
        for (int i2 = 0; LIMIT > i2; i2++) {
            final String s = "" + i;
            final String t = "" + i2;
            if (anagram2(s, t) != anagram(s, t)) {
                System.err.println("s: " + s + "  t:" + t);
            }
        }
    }
    System.err.println("end");
}

Your implementation always returns the same result as the reference implementation for every pair in this range (the numbers 0 to 9998). I think it is safe to assume that if there is no counterexample in this range, there will be none elsewhere either.

By the way, both implementations run at nearly the same speed (at least in my test).
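Decimal numbers cover only a narrow family of strings (lengths 1 to 4, no leading zeros). As a complementary check, here is a minimal sketch that enumerates every string over a small alphabet up to a given length and counts the pairs on which two predicates disagree. The class name `ExhaustiveCheck`, the helpers `allStrings` and `countMismatches`, and the sorting-based oracle `sortedEquals` are assumptions introduced for illustration; they are not from the book. The asker's `anagram2` is reproduced in condensed form so the block compiles on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

public class ExhaustiveCheck {

    // The asker's counting logic, condensed but behaviorally the same.
    public static boolean anagram2(String s, String t) {
        if (s.length() != t.length()) return false;
        int[] letters = new int[256];
        for (char c : s.toCharArray()) ++letters[c];
        for (int i = 0; i < t.length(); ++i) {
            int c = t.charAt(i);
            if (letters[c] == 0) return false; // more of c in t than in s
            --letters[c];
            if (letters[c] == 0 && i == t.length() - 1) return true;
        }
        return false;
    }

    // Hypothetical oracle: anagrams have identical sorted character arrays.
    public static boolean sortedEquals(String s, String t) {
        char[] a = s.toCharArray();
        char[] b = t.toCharArray();
        Arrays.sort(a);
        Arrays.sort(b);
        return Arrays.equals(a, b);
    }

    // All strings over `alphabet` with length 1..maxLen (the empty string is excluded).
    public static List<String> allStrings(String alphabet, int maxLen) {
        List<String> result = new ArrayList<>();
        List<String> previous = new ArrayList<>();
        previous.add("");
        for (int len = 1; len <= maxLen; ++len) {
            List<String> current = new ArrayList<>();
            for (String prefix : previous)
                for (char c : alphabet.toCharArray())
                    current.add(prefix + c);
            result.addAll(current);
            previous = current;
        }
        return result;
    }

    // Number of ordered pairs (s, t) on which the two predicates disagree.
    public static int countMismatches(BiPredicate<String, String> f,
                                      BiPredicate<String, String> g,
                                      List<String> inputs) {
        int mismatches = 0;
        for (String s : inputs)
            for (String t : inputs)
                if (f.test(s, t) != g.test(s, t)) ++mismatches;
        return mismatches;
    }

    public static void main(String[] args) {
        List<String> inputs = allStrings("abc", 5);
        // Prints 0 when the two predicates agree on every pair.
        System.out.println(countMismatches(
                ExhaustiveCheck::anagram2, ExhaustiveCheck::sortedEquals, inputs));
    }
}
```

A three-letter alphabet forces many repeated characters, which is where the counting logic is exercised most. The empty string is left out on purpose, because it is a boundary case that deserves a separate look.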

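Yes, your method is correct. I agree that the method anagram has a lot of redundancy. Because both strings have the same length, if no character of t ever finds its count already at zero, every count must end at exactly zero, so no extra bookkeeping is needed. Here's an even simpler version of anagram2: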

public static boolean anagram2(String s, String t) {
    if (s.length() != t.length()) return false;
    int[] letters = new int[256];
    char[] s_array = s.toCharArray();
    for (char c : s_array)
        ++letters[c];
    for (int i = 0; i < t.length(); ++i) {
        int c = (int) t.charAt(i);
        if (letters[c] == 0)
            return false;
        --letters[c];
    }
    return true;
}
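Two inputs are worth checking by hand, because they fall outside what the tests in this thread generate. A character with a code of 256 or above indexes past the `int[256]` table and throws an `ArrayIndexOutOfBoundsException`. For two empty strings, the versions whose loop is followed by `return false` report `false`, while the simplified version above returns `true`. The following is a minimal sketch of a map-based variant, assuming that arbitrary `char` values should be accepted and that two empty strings count as anagrams. The class name `AnagramMap` and the method `isAnagram` are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class AnagramMap {

    // Counts UTF-16 code units, so it is not limited to codes below 256.
    // Assumption: two empty strings are treated as anagrams of each other.
    public static boolean isAnagram(String s, String t) {
        if (s.length() != t.length()) return false;
        Map<Character, Integer> counts = new HashMap<>();
        for (char c : s.toCharArray())
            counts.merge(c, 1, Integer::sum);
        for (char c : t.toCharArray()) {
            Integer remaining = counts.get(c);
            // Missing or exhausted: t has more of c than s does.
            if (remaining == null || remaining == 0) return false;
            counts.put(c, remaining - 1);
        }
        // Equal lengths and no shortfall means every count is back at zero.
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("", ""));
        System.out.println(isAnagram("ΩΩa", "aΩΩ"));
        System.out.println(isAnagram("onex", "noey"));
    }
}
```

The map costs more per character than the array, so the array version remains a reasonable choice when the input is known to be ASCII or Latin-1.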

Here's a piece of code with which you can test your anagram2 versions:

import java.util.Random;

static Random r = new Random();
public static String generateString(int n) {
    StringBuilder sb = new StringBuilder();
    for (int i =  0; i < n; ++i)
        sb.append((char) (r.nextInt(3) + 'a'));
    return sb.toString();
}

static void test(int cases, int stringLength) {
    for (int i = 0; i < cases; ++i) {
        String s = generateString(stringLength);
        String t = generateString(stringLength);
        boolean ans1 = anagram(s, t);
        boolean ans2 = anagram2(s, t);

        if (ans1 != ans2) {
            System.out.printf("TESTCASE %d: FAIL\n", i+1);
            System.out.printf("%b %b\n", ans1, ans2);
            System.out.printf("%s %s\n", s, t);
            return;
        } else {
            System.out.printf("TESTCASE %8d: OK\n", i + 1);
        }
    }
}

To test your code just call

test(10000, 3);

depending on how many test cases you want to run and how long the strings should be. Don't run it on long strings, since the chance of generating an anagram pair is small. Length 3 seems reasonable.
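To put a number on how quickly the hit rate drops with length, here is a small sketch that counts, for a k-letter alphabet, how many ordered pairs of length-n strings are anagrams of each other. It groups strings by their letter counts and sums the squared group sizes. The class name `AnagramOdds` and its methods are assumptions for illustration. With k = 3 (matching `r.nextInt(3)` above) and n = 3, it finds 93 anagram pairs out of 729, roughly 12.8%.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class AnagramOdds {

    // k^n: the number of length-n strings over a k-letter alphabet.
    public static long totalStrings(int k, int n) {
        long total = 1;
        for (int i = 0; i < n; ++i) total *= k;
        return total;
    }

    // Ordered pairs (s, t) of length-n strings over k letters that are anagrams.
    public static long anagramPairs(int k, int n) {
        long total = totalStrings(k, n);
        Map<String, Long> groupSizes = new HashMap<>();
        for (long code = 0; code < total; ++code) {
            // Read `code` as an n-digit base-k number; each digit is one letter.
            int[] counts = new int[k];
            long rest = code;
            for (int pos = 0; pos < n; ++pos) {
                counts[(int) (rest % k)]++;
                rest /= k;
            }
            // Strings with the same letter counts are anagrams of each other.
            groupSizes.merge(Arrays.toString(counts), 1L, Long::sum);
        }
        long pairs = 0;
        for (long size : groupSizes.values()) pairs += size * size;
        return pairs;
    }

    public static void main(String[] args) {
        int k = 3;
        for (int n = 1; n <= 6; ++n) {
            long total = totalStrings(k, n);
            double fraction = (double) anagramPairs(k, n) / ((double) total * total);
            System.out.printf("length %d: %.4f%n", n, fraction);
        }
    }
}
```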

