Anagram of 2 strings: I don't understand what the problem is with my code

Question

My code is supposed to return the minimum number of character deletions required to make two strings anagrams of each other:

#include <bits/stdc++.h>

using namespace std;

int makeAnagram(string a, string b) {
    int count = 0;
    for(auto it=a.begin(); it!=a.end(); it++){
        if(find(b.begin(), b.end(), *it) == b.end()){
            a.erase(it);
            count++;
        }
    }
    for(auto it = b.begin(); it != b.end(); it++){
        if(find(a.begin(), a.end(), *it) == a.end()){
            b.erase(it);
            count++;
        }
    }
    return count;
}

It doesn't work at all, and I don't understand why. The main test is:

int main()
{

    string a={'a','b','c'};

    string b={'c','d','e'};

    int res = makeAnagram(a, b);

    cout << res << "\n";

    return 0;
}

The console is supposed to print 4, but it prints 2 instead, and the strings a and b each have 2 elements at the end of the program, when they should have size 1.

Answer 1

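The problem with your approach is that you erase elements while iterating without accounting for the effect on the iterator. erase() shifts the following characters one position to the left (and formally invalidates the iterator), so the it++ that follows skips the character that moved into the erased slot. You have to continue from the position of the erased element instead of incrementing past it. Here is a simple approach: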

int makeAnagram(string a, string b) {
    int A = a.size();
    int B = b.size();
    int count = 0;
    if (A > B)
    {
        for (auto i = b.begin(); i != b.end(); i++)
        {
            size_t t = a.find(*i);
            if (t == std::string::npos)
            {
                count++;
            }
            else
            {
                a.erase(a.begin() + t);
            }
            
        }
       
        count = count + A - (b.size() - count);
        
    }
    else
    {
        for (auto i = a.begin(); i != a.end(); i++)
        {
            size_t t = b.find(*i);
            if (t == std::string::npos)
            {
                count++;
            }
            else
            {
                b.erase(b.begin() + t);
            }
            
        }
       
        count = count + B - (a.size() - count);
    }
    return count;
}
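
For reference, the iterator handling in the question's loops could be repaired as in the sketch below. It is only a minimal illustration, not the approach shown above: eraseMissing is a hypothetical helper name, and the sketch relies on std::string::erase returning an iterator to the character that follows the erased one. Note that it only drops characters that are completely absent from the other string, so it does not account for characters that occur a different number of times in the two strings.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: removes from `s` every character that does not occur
// in `other` and returns how many characters were removed.
int eraseMissing(std::string& s, const std::string& other) {
    int removed = 0;
    for (auto it = s.begin(); it != s.end(); /* advanced in the body */) {
        if (std::find(other.begin(), other.end(), *it) == other.end()) {
            // erase() invalidates `it`; continue from the iterator it returns.
            it = s.erase(it);
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```

Called on a = "abc" with b = "cde", this removes 'a' and 'b'. Calling it afterwards on b with the shortened a removes 'd' and 'e', which gives the expected total of 4.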

Answer 2

Hm, I thought I had already answered this question somewhere else. Anyway, let's try again. I doubt that there is a faster solution than the one below, but we never know.

As always, the most important thing is to find a good algorithm. After that, we can do some good coding to get a fast solution.

Let's think about it, starting with 2 simple strings:

abbccc
abbccc

They are identical, so there is nothing to erase and the result is 0. But how do we come to this conclusion? We could think of sorting, searching, or comparing character by character, but the correct approach is counting the occurrences of characters. That is done nearly every time when dealing with anagrams. Here, each string has 1 a, 2 b, and 3 c.

And if we compare the counts for each character in the strings, then they are the same.

If we remember our school days from long ago, or C code, or even better microcontroller assembler code, then we know that comparing can be done by subtracting. Some examples: 6-4=2, 3-4=-1, 7-7=0. So that approach can be used.

Next example for 2 strings:

 bbcccddd
abbccc

We already see by looking at it that we need to delete 3 "d" from the first string and one "a" from the second string, 4 deletions overall. Let's look at the counts. String a: b->2, c->3, d->3. String b: a->1, b->2, c->3.

And now let's compare, that is, subtract: a->0-1=-1, b->2-2=0, c->3-3=0, d->3-0=3.

If we add up the absolute values of the deltas, we have the result: abs(-1)+0+0+3=4.

OK, now, we can start to code this algorithm.

1. Read the 2 source strings a and b from std::cin. For this we will use std::getline.
2. Define a "counter" as an array. We assume that a char is 8 bits wide, so the maximum number of distinct characters is 256.
3. Count all character occurrences of the first string positively.
4. Do the comparison and counting in one step, by decrementing the counter for each occurrence of a character in the 2nd string.
5. Accumulate all counters (for all occurrences of characters). We use the absolute value, because numbers could be negative.

Then we have the result.

Please note that an array of only 26 counters would suffice, because the requirements state an input range of 'a'-'z' for the characters of the strings. But then we would need to map the character values 'a'-'z' to the indices 0-25 by always subtracting 'a' from a character. At the cost of a little wasted space (230 unused counters), we can omit the subtraction.

Please see:

#include <cstdlib>
#include <iostream>
#include <string>

int main() {

    // Here we will store the input, 2 strings to check
    std::string a{}, b{};

    // Read the strings from std::cin
    std::getline(std::cin, a);
    std::getline(std::cin, b);

    // Here we will count the occurrences of characters.
    // We assume a char type with a width of 8 bit
    int counter[256]{};

    // Count occurrences of characters in string a positively
    // and occurrences of characters in string b negatively.
    // The cast avoids a negative index where char is signed
    for (const char c : a) ++counter[static_cast<unsigned char>(c)];
    for (const char c : b) --counter[static_cast<unsigned char>(c)];

    // Calculate result
    int charactersToDeleteForAnagram{};
    for (int c : counter) charactersToDeleteForAnagram += std::abs(c);

    std::cout << charactersToDeleteForAnagram << '\n';

    return 0;
}
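
The 26-counter variant mentioned above could look like the following sketch. It assumes that both inputs contain only the lowercase letters 'a' to 'z' and that these letters have consecutive character codes (true for ASCII). The function name makeAnagram is borrowed from the question; the const reference parameters are an assumption.

```cpp
#include <array>
#include <cstdlib>
#include <string>

// Sketch: assumes a and b consist solely of the characters 'a'..'z'.
int makeAnagram(const std::string& a, const std::string& b) {
    std::array<int, 26> counter{};  // one counter per letter, zero-initialised

    // Map 'a'..'z' to the indices 0..25 by subtracting 'a'.
    for (const char c : a) ++counter[c - 'a'];  // letters of a count positively
    for (const char c : b) --counter[c - 'a'];  // letters of b count negatively

    // The sum of the absolute deltas is the number of deletions.
    int deletions = 0;
    for (const int delta : counter) deletions += std::abs(delta);
    return deletions;
}
```

With the strings from the question, makeAnagram("abc", "cde") evaluates to 4, and the worked example "bbcccddd" and "abbccc" also gives 4.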

We can also convert this to more modern C++, where we use input checking, a std::unordered_map for counting, and std::accumulate for summing up. Then the internal representation of the char type doesn't matter. The principle is the same.

I do not know whether this is much slower.

Please see:

#include <cstdlib>
#include <iostream>
#include <string>
#include <unordered_map>
#include <numeric>

int main() {

    // Here we will store the input, 2 strings to check
    std::string aString{}, bString{};

    // Read the strings from std::cin
    if (std::getline(std::cin, aString) && std::getline(std::cin, bString)) {

        // Here we will count the occurrences of characters.
        // A map needs no assumption about the width of the char type
        std::unordered_map<char, int> counter{};

        // Count occurrences of characters in string a positively
        // and occurrences of characters in string b negatively
        for (const char character : aString) counter[character]++;
        for (const char character : bString) counter[character]--;

        // Calculate result and show to user
        std::cout << std::accumulate(counter.begin(), counter.end(), std::size_t{},
            [](std::size_t sum, const auto& entry) { return sum + std::abs(entry.second); }) << '\n';
    }
    else std::cerr << "\nError: Problem with input\n";
    return 0;
}

If you have any questions, please ask.

Language: C++ 17

Compiled and tested with MS Visual Studio 2019 Community Edition

2 answers

solution1
1 ACCEPTED 2020-08-14 10:39:37

solution2
1 2020-08-14 13:11:58
