简体   繁体   中英

Longest Substring with Non-Repeating Character

I am trying to solve the problem of finding the longest substring without a repeating character from a given string.

class Solution(object):
    def lengthOfLongestSubstring(self, s):
        """
        :type s: str
        :rtype: int
        """
        start = 0
        mlen = -1
        cur_ = {}
        cur = 0

        while(start<len(s) and cur<len(s)):
            if s[cur] in cur_ and cur_[s[cur]] >= start:
                if cur - start > mlen:
                    mlen = cur - start
                start = cur_[s[cur]] + 1    
                cur_[s[cur]] = cur            
            else:
                cur_[s[cur]] = cur
                if cur - start > mlen:
                    mlen = cur - start
            cur = cur + 1            
        return mlen

x = Solution()
print(x.lengthOfLongestSubstring("abababcdef"))

I think I am solving it correctly:

Start a new substring when you encounter a repeated character.

But I am not getting the answers right?

In the above example the output is 5 whereas the correct answer is 6.

But for this case:

print(x.lengthOfLongestSubstring("ababa"))

the output is correct ie 2.

Not sure why am I failing that case? Thanks.

I've changed your function a bit to return the longest substring of unique characters instead of just length of it. If you want length - you can always get that from string.

def get_longest_substring(input):
    """
    :type input: str
    :rtype: str
    """

    current = []
    all_substrings = []
    for c in input:
        if c in current:
            all_substrings.append(''.join(current))
            cut_off = current.index(c) + 1
            current = current[cut_off:]
        current += c
    all_substrings.append(''.join(current))

    longest = max(all_substrings, key=len)
    return longest

longest = get_longest_substring("abababcdefc")
print(longest, len(longest))

Code goes through each char building a char array.

If it finds a char already in the array it keeps a copy of the array, cuts off beginning of it up to that character and keeps building it.

At the end it picks longest substring it found and returns it.

You are updating mlen incorrectly in the else branch, you forgot to add current character. Also, you don't need to update mlen when you meet a repetition:

if s[cur] in cur_ and cur_[s[cur]] >= start:
    start = cur_[s[cur]] + 1    
else:
    mlen = max(mlen, cur - start + 1)

cur_[s[cur]] = cur
cur = cur + 1             

I can suggest you this simple algorithm :

1. set all variables to empty.

2. for each letter ch in the string :

2.1. check if ch exists in the dict of the currrent found sub string ?

if it does - check if the cur' sub string longer then the max (maxSubStr initialyzed to "") ?, does - seve the cur' sub string in the max. set the dict of the currrent found sub string with the value ch , and set the current sub string to ch .

if it doen't - add ch to the dict of the currrent found sub string. and concat the cur' substring with ch .

3. return the length of the longest of the current substring and the max.

class Solution(object):

def lengthOfLongestSubstring(self, s):
    """
    :type s: str
    :rtype: int
    """

    curSubStr = ""
    curSubStrDict = {}
    maxSubStr = ""
    for ch in s :
        if ch in curSubStrDict :
            if len(maxSubStr) < len(curSubStr):
                maxSubStr = curSubStr
            curSubStrDict = {}
            curSubStrDict[ch] = ch
            curSubStr = ""+ch

        else :
            curSubStrDict[ch] = ch
            curSubStr += ch

    return len(curSubStr) if len(curSubStr) > len(maxSubStr) else len(maxSubStr)

x = Solution()
print(x.lengthOfLongestSubstring("abcaabccdefgfgh")) # 5 = |cdefg|
print(x.lengthOfLongestSubstring("abababcdef")) # 6 = |abcdef|

just like finding max element in an array, we "iterate over" (not actually iterate) the -substrings without repeating char- and saving the longest.

the iteration happens when we detect a char that contained in the current substring. than we iterate to the next substring.

While you can keep track of your position in the string by incrementing markers at each iteration, it would be simpler to utilize nested for-loops:

s = "abababcdef"
long_run = max({s[i:b] for i in range(len(s)) for b in range(len(s)) if len(set(s[i:b])) == len(s[i:b])}, key=len)

Using set comprehension , the code first finds all possible substrings utilizing a nested for loop. Each for-loop generates all the values in the range of the length of the string, thus removing the need to mutate upper and lower bounds. The comprehension filters all substrings that contain at least one repeating value. To do this, the code uses the set function to remove all duplicates, thus enabling a check for equal lengths of the substring caste to a set and the length of the original substring. Lastly, the max function is applied to the set to find the longest substring in the set via a key of len .

Output:

'abcde'

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM