Longest Non-Repeating Substring in C++

Question

I am trying to find the longest substring with no repeated characters. I have a boolean vector to keep track of the 256 ASCII characters.

#include <iostream>
#include <cstdio>
#include <string>
#include <algorithm>
#include <vector>

using namespace std;

int main()
{
    string s = "aaaaaaaaaaaaasadfrhtytbgrbrsvsrhvsg";
    vector<bool> v(256, false);
    int j = 0, len = 0, index = 0;

    for(int i = 0; i < s.length(); i++)
    {
        if(!v[s[i]])
        {
            j++;

            if(j > len)
            { 
                len = j;
                index = i - j;
            }

            v[s[i]] = true;
        }
        else
        {
            j = 0;
            v.clear();
        }
    }

    cout << s.substr(index, len) + " " << len << endl;
}

I can't understand why it gives the output adfrht 6, whereas the correct output is sadfrhty 8.

Answer 1

You're getting the wrong result because the basic approach is missing a few moving pieces: you're not tracking all the information needed to calculate this. Not only do you need to track which characters you have seen, you also need to track the position in the string where each one was last seen (I presume that you wish to keep this at O(n) complexity).

That way, when you see a character that has been encountered before, you can reset the "consecutive non-repeated characters seen so far" counter so that it begins just after the previous occurrence of that character. Additionally, every character before that previous occurrence must be marked as no longer seen, because it has dropped out of the run you are now counting.

That's the missing piece from your implementation. Also, it doesn't track the position of the longest substring quite right: j has already been incremented when index = i - j is computed, so the start comes out one too early (it should be i - j + 1). And v.clear() empties the vector rather than resetting its flags to false, so the later v[s[i]] accesses are out of bounds.

The following should produce the results you were looking for.

Let us know what grade you received for your homework assignment :-)

#include <iostream>
#include <cstdio>
#include <string>
#include <algorithm>
#include <vector>

using namespace std;

int main()
{
    string s = "aaaaaaaaaaaaasadfrhtytbgrbrsvsrhvsg";
    vector<bool> v(256,false);
    vector<int> seen_at(256);

    int longest_starting_pos=0, longest_length=0;

    int current_length=0;

    for (int i=0; i<s.length(); i++)
    {
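        if (v[s[i]])
        {
            // s[i] is already in the current run: forget every character
            // that precedes its previous occurrence...
            for (int j=i-current_length; j<seen_at[s[i]]; ++j)
                v[s[j]]=false;
            // ...and restart the run just after that occurrence
            // (the ++current_length below counts s[i] itself).
            current_length=i-seen_at[s[i]]-1;
        }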

        v[s[i]]=true;
        seen_at[s[i]]=i;
        if (++current_length > longest_length)
        {
            longest_length=current_length;
            longest_starting_pos=i-current_length+1;
        }
    }

    cout<<s.substr(longest_starting_pos, longest_length)+" "<<longest_length<<endl;
}
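
As a variation on the same idea, here is a minimal sketch that keeps only the last-seen positions and drops the boolean vector. It is not the code above, just one way it could be packaged as a function. The name longestUniqueRun and its {start, length} return value are my own assumptions for illustration, and it assumes 8-bit characters.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns {start, length} of the first longest run
// of distinct characters in s. Assumes 8-bit characters.
std::pair<int, int> longestUniqueRun(const std::string& s)
{
    std::vector<int> lastSeen(256, -1);   // -1 means "not seen yet"
    int bestStart = 0, bestLen = 0;
    int start = 0;                        // start of the current run

    for (int i = 0; i < static_cast<int>(s.length()); i++)
    {
        // Cast so that characters above 127 never give a negative index.
        unsigned char c = static_cast<unsigned char>(s[i]);

        // If c already occurs inside the current run, the run must
        // restart just past that earlier occurrence.
        if (lastSeen[c] >= start)
            start = lastSeen[c] + 1;

        lastSeen[c] = i;

        int len = i - start + 1;
        if (len > bestLen)
        {
            bestLen = len;
            bestStart = start;
        }
    }
    return {bestStart, bestLen};
}
```

Comparing lastSeen[c] against start replaces the inner loop that un-marks characters: a position that lies before the start of the current run is simply ignored. For the string in the question, s.substr(result.first, result.second) gives sadfrhty.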

Answer 2

Your algorithm is not correct. The problem is that once it has passed a character, it never goes back to reconsider it when the substring containing that character turns out not to be the longest. The first s is counted as part of the length-2 substring as, but when the next a is found the counter is reset and the s is forgotten, even though it could be the start of a longer substring. Try this code:

#include <iostream>
#include <cstdio>
#include <string>
#include <algorithm>
#include <vector>

using namespace std;

int main()
{
    string s = "aaaaaaaaaaaaasadfrhtytbgrbrsvsrhvsg";
    vector<bool> v(256,false);
    int longStart = 0;
    int longEnd = 0;
    int start = 0;

    for (int end = 0; end < s.length(); end++)
    {
        if (!v[s[end]])   // if character not already in the substring
        {
            v[s[end]] = true;

            if (end - start > longEnd - longStart)
            {
                longEnd = end;
                longStart = start;
            }
        }
        else   //the character is already in the substring, so increment the
               //start of the substring until that character is no longer in it
        {
            //get to the conflicting character, un-marking each character
            //that leaves the substring, but don't get to the new character
            while ((s[start] != s[end]) && (start < end))
            {
                v[s[start]] = false;
                start++;
            }

            //remove the conflicting character
            start++;
            //don't set v[s[end]] to false because that character is still
            //encountered, but as the newly appended character, not the first
        }
    }

    longEnd++;    //make longEnd the index after the last character for substring purposes
    cout << s.substr(longStart, longEnd - longStart) + " " << (longEnd - longStart) << endl;
}

Basically, this code keeps a running substring. Whenever it encounters a character that is already in the substring, it increments the start of the substring until that character is no longer in it, then continues as normal. Every time the end is incremented, it also checks whether the running substring is longer than the longest one found so far. This is O(n), as you wanted: start and end only ever move forward, so each character enters and leaves the substring at most once.

Also, spread your code out. Concise code means nothing if you cannot read it and debug it easily. Also, if you are having issues with your code, work it all out by hand to get a greater understanding of how it all works and what is happening.

Hope this helps!

2 answers

Answer 1: score 4, accepted, 2015-06-20 15:57:07
Answer 2: score 0, 2015-06-20 16:07:41
