Longest Substring with At Most K Distinct Characters

Question

Given a string s, find the length of the longest substring T that contains at most k distinct characters.

Example

For example, Given s = "eceba", k = 3,

T is "eceb" which its length is 4.

Challenge

O(n), n is the size of the string s.

Tags

String Two Pointers LintCode Copyright Hash Table

Related Problems

Medium Longest Substring Without Repeating Characters

Analysis

窗口型two pointers的又一实例。

需要注意的是内层while循环中的跳出条件,是!map.containsKey(ch_j) && map.size() == k,表明在目前出现了第一个character将使得map.size() > k,因此break。跳出内层while循环时,j - i就是substring的长度,因为j指向了substring的下一个char (假设j指向substring的尾部字符,则应该用j - i + 1计算substring长度)。

Solution

Two Pointers

public class Solution {
    /**
     * @param s : A string
     * @return : The length of the longest substring
     *           that contains at most k distinct characters.
     */
    public int lengthOfLongestSubstringKDistinct(String s, int k) {
        if (s == null || s.length() == 0 || k == 0) {
            return 0;
        }
        HashMap<Character, Integer> map = new HashMap<Character, Integer>();
        int strLength = s.length();
        int i = 0;
        int j = 0;
        int ans = Integer.MIN_VALUE;

        for (i = 0; i < strLength; i++) {
            while (j < strLength) {
                char ch_j = s.charAt(j);
                if (!map.containsKey(ch_j)) {
                    if (map.size() == k) {
                        break;
                    }
                    map.put(ch_j, 1);
                } else {
                    map.put(ch_j, map.get(ch_j) + 1);
                }
                j++;
            }
            ans = Math.max(ans, j - i);
            char ch_i = s.charAt(i);
            if (map.containsKey(ch_i)) {
                if (map.get(ch_i) - 1 == 0) {
                    map.remove(ch_i);
                } else {
                    map.put(ch_i, map.get(ch_i) - 1);
                }
            }

        }

        return ans;
    }
}

(Preferred Implementation)* Sliding Window - Two Pointers (16ms, 48.48%)

O(n) time, O(n) space

辅助HashMap存sliding window中每个distinct character的出现次数,记录元素在窗口中出现次数;

当HashMap中的元素个数大于k时,就需要移动慢指针,并且每次向前移动慢指针时,就给慢指针对应元素的计数减1,如果减为0,则remove慢指针对应元素。直到HashMap中的元素个数 == k。则继续外层循环,移动快指针,加入新的元素,或者更新原有元素的计数。

class Solution {
    public int lengthOfLongestSubstringKDistinct(String s, int k) {
        if (s == null || s.length() == 0 || k == 0) {
            return 0;
        }
        HashMap<Character, Integer> count = new HashMap<>();
        int n = s.length();
        int firstIdx = 0;
        char[] chs = s.toCharArray();

        int longest = 0;

        for (int i = 0; i < n; i++) {
            count.put(chs[i], count.getOrDefault(chs[i], 0) + 1);
            while (count.size() > k) {
                count.put(chs[firstIdx], count.get(chs[firstIdx]) - 1);
                if (count.get(chs[firstIdx]) == 0) {
                    count.remove(chs[firstIdx]);
                }
                firstIdx++;
            }
            longest = Math.max(longest, i - firstIdx + 1);
        }
        return longest;
    }
}

Follow Up

The interviewer may say that the string is given as a steam. In this situation, we can't maintain a "left pointer" as the classical O(n) hashmap solution.

[[https://leetcode.com/problems/longest-substring-with-at-most-k-distinct-characters/discuss/80044/Java-O(nlogk)-using-TreeMap-to-keep-last-occurrence-Interview-"follow-up"-question!](https://leetcode.com/problems/longest-substring-with-at-most-k-distinct-characters/discuss/80044/Java-O(nlogk)-using-TreeMap-to-keep-last-occurrence-Interview-"follow-up"-question!](https://leetcode.com/problems/longest-substring-with-at-most-k-distinct-characters/discuss/80044/Java-O%28nlogk%29-using-TreeMap-to-keep-last-occurrence-Interview-"follow-up"-question!]%28https://leetcode.com/problems/longest-substring-with-at-most-k-distinct-characters/discuss/80044/Java-O%28nlogk%29-using-TreeMap-to-keep-last-occurrence-Interview-"follow-up"-question!)\)

Instead of recording each char's count, we keep track of char's last occurrence. If you consider k as constant, it is also a O(n) algorithm.

inWindow keeps track of each char in window and its last occurrence position

TreeMap + HashMap - O(nlogk)

   public class Solution {
        public int lengthOfLongestSubstringKDistinct(String str, int k) {
            if (str == null || str.isEmpty() || k == 0) {
                return 0;
            }
            TreeMap<Integer, Character> lastOccurrence = new TreeMap<>();
            Map<Character, Integer> inWindow = new HashMap<>();
            int j = 0;
            int max = 1;
            for (int i = 0; i < str.length(); i++) {
                char in = str.charAt(i);
                while (inWindow.size() == k && !inWindow.containsKey(in)) {
                    int first = lastOccurrence.firstKey();
                    char out = lastOccurrence.get(first);
                    inWindow.remove(out);
                    lastOccurrence.remove(first);
                    j = first + 1;
                }
                //update or add in's position in both maps
                if (inWindow.containsKey(in)) {
                    lastOccurrence.remove(inWindow.get(in));
                }
                inWindow.put(in, i);
                lastOccurrence.put(i, in);
                max = Math.max(max, i - j + 1);
            }
            return max;
        }
    }

Another Implementation based on LinkedHashMap - O(n)

class Solution {
    public int lengthOfLongestSubstringKDistinct(String s, int k) {
        int left = 0;
        int max = 0;
        LinkedHashMap<Character, Integer> map = new LinkedHashMap<>();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (map.containsKey(c)) {
                map.remove(c);
            }
            map.put(c, i);
            if(map.size() > k) { 
                char key = map.keySet().iterator().next();
                left = map.get(key) + 1;
                map.remove(key);
            }
            max = Math.max(max, i - left + 1);
        }
        return max;
    }
}

Last updated