Problem Statement
Given a string s, find the length of the longest sub-string without duplicate characters.
Examples
- For string “abcabcbb”, longest such sub-string is of length 3. For this, we can have multiple answers such as “abc”, “bca”, “cab”.
- For string “bbbbb”, longest such sub-string is clearly of length 1, as all characters are the same.
- For string “pwwkew”, longest such sub-string is of length 3. Like “wke”, “kew”.
Solution
Points of Interest
- Characters before and after the longest sub-string should be already present in the sub=string. This is true, because if there was a new character, then that sub-string cannot be the longest sub-string.
- Let’s try to solve it using brute force method. In here we consider all sub-strings and find the max length. Let’s note down few observations
- There will be a lot of duplicate calculations. Let’s say for “abcaxfaxa” string we are looking at “bcaxf”.
- If we move end one position and consider “bcaxfa”, then “a” is duplicate.
- Notice that, with this info we need not consider “caxfa” and “afa” substrings.
Algo
We will use Sliding window method to eliminate duplications mentioned above.
- Initialize start and end with 0. These values represents the boundaries of the current window (both inclusive).
- first_i is a hash-nap that will store the first index of a character in the current window. Populate it with the first character of the string.
- Initialize max_len = 1. This will store the maximum length sub-string seen so far that matches our requirements.
- Start an indefinite loop.
- We will make sure that any window at this stage has no duplicates. So update max_len with current window size, if max_len is smaller.
- Move end a step in string.
- If end is greater than or equal to length of string, then break from loop.
- If you observe that end character is already in first_i, then move start to a point just after the index of end character present in the current sub-string. Update first_i accordingly.
- Make sure to update the end character index in first_i.
- At this point you will have max_len as the desired result.
It is possible that string can be empty or None. So handle this case early in the code
Code
def lengthOfLongestSubstring(s: str) -> int:
if not s:
return 0
start, end, first_i, max_len = 0, 0, {s[0]: 0}, 1
while True:
max_len = max(max_len, (end - start + 1))
end += 1
if end >= len(s):
break
if s[end] in first_i:
stop = first_i[s[end]]
while start <= stop:
first_i.pop(s[start])
start += 1
first_i[s[end]] = end
return max_len
Complexity
- Even though we have 2 loops, our start and end visits each character only once.
- Most of the memory will be used by first_i variable. In worst case scenario, all characters of the string are unique.