Longest Consecutive Sequence

Union Find, HashMap, HashSet, Multiple Solutions

Question

Given an unsorted array of integers, find the length of the longest consecutive elements sequence.

Clarification

Your algorithm should run in O(n) time complexity.

Example

Given [100, 4, 200, 1, 3, 2],

The longest consecutive elements sequence is [1, 2, 3, 4]. Return its length: 4.

Analysis

LeetCode official solution: https://leetcode.com/problems/longest-consecutive-sequence/solution/

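Several approaches:

1. For each n, check whether n - 1 and n + 1 are already in a set (or map). Each map operation is O(1) on average, so the whole pass is O(n).
2. For each n, check whether it is the lower bound of a consecutive sequence, i.e. n - 1 is not in the set. If so, check n + 1, n + 2, n + 3, ... in turn until reaching the first value m that is not in the set; the sequence length is then m - n. (Symmetrically, one can start from values whose n + 1 is absent and scan downward.) Building the set takes O(n) time. Afterwards, an element of the set triggers a scan only if it is a lower bound, so every consecutive sequence is scanned exactly once and the loop as a whole is O(n).
3. Sort first, then scan the sorted array to find the longest consecutive run. The drawback is that comparison sorting is generally O(n log n), which does not meet the O(n) time requirement; the advantage is that the extra space can be O(1) if an in-place sorting algorithm is used.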

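Also, the O(n) time requirement might suggest that nested loops are ruled out, yet the second approach is exactly a nested loop. The difference is that the inner loop does not run for every element: it runs only from the lower bound of each sequence, and across all outer iterations it visits each distinct value at most once, so the total work remains O(n).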
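To make the cost argument for the second approach concrete, here is a minimal sketch that counts how many times the inner loop body actually runs. The class name StreakWalkCounter and the method longestWithStepCount are hypothetical names chosen for this illustration; they do not come from the linked solutions.

```java
import java.util.HashSet;
import java.util.Set;

class StreakWalkCounter {
    // Returns {longest streak length, total inner-loop iterations}.
    static int[] longestWithStepCount(int[] nums) {
        Set<Integer> seen = new HashSet<>();
        for (int n : nums) {
            seen.add(n);
        }
        int longest = 0;
        int innerSteps = 0;
        for (int n : seen) {
            if (seen.contains(n - 1)) {
                continue; // n is not a lower bound, so the inner loop is skipped
            }
            int m = n + 1;
            while (seen.contains(m)) {
                m++;
                innerSteps++; // each distinct value is walked over at most once
            }
            longest = Math.max(longest, m - n);
        }
        return new int[] {longest, innerSteps};
    }
}
```

For [100, 4, 200, 1, 3, 2] the inner loop runs only from 1 (over 2, 3, 4), so the method returns {4, 3}: three inner iterations for six elements. In general the inner-loop count cannot exceed the number of distinct values, which is why the nested loop still fits in O(n).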

HashMap

For the first approach, a detailed explanation is given here: https://leetcode.com/discuss/18886/my-really-simple-java-o-n-solution-accepted

Whenever a new element n is inserted into the map, do two things:

See if n - 1 and n + 1 exist in the map, and if so, it means there is an existing sequence next to n. Variables left and right will be the length of those two sequences, while 0 means there is no sequence and n will be the boundary point later. Store (left + right + 1) as the associated value to key n into the map.
Use left and right to locate the other end of the sequences to the left and right of n respectively, and replace the value with the new length.

Everything inside the for loop is O(1) on average, so the total time is O(n).
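The two steps above can be traced on the example input. The sketch below returns the map itself instead of the maximum length so that the stored values can be inspected; BoundaryMapTrace and buildLengthMap are hypothetical names used only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

class BoundaryMapTrace {
    // Builds the value -> sequence-length map described above and returns it.
    static Map<Integer, Integer> buildLengthMap(int[] nums) {
        Map<Integer, Integer> lengths = new HashMap<>();
        for (int n : nums) {
            if (lengths.containsKey(n)) {
                continue; // duplicate: already accounted for
            }
            int left = lengths.getOrDefault(n - 1, 0);   // length of the run ending at n - 1
            int right = lengths.getOrDefault(n + 1, 0);  // length of the run starting at n + 1
            int sum = left + right + 1;
            lengths.put(n, sum);
            lengths.put(n - left, sum);   // far end of the left run
            lengths.put(n + right, sum);  // far end of the right run
        }
        return lengths;
    }
}
```

For [100, 4, 200, 1, 3, 2] the final map holds 4 at keys 1 and 4 (the two boundaries) and at key 2 (the last value inserted), while key 3 still holds the stale value 2. Stale interior values are harmless: a newly inserted n is never already in the map, so its neighbors n - 1 and n + 1, if present, are always the ends of their runs, and only boundary entries are ever read.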

Source of the second approach: https://leetcode.com/discuss/38619/simple-o-n-with-explanation-just-walk-each-streak

Solution

O(n) HashMap - store sequence length in the boundary points of the sequence

The key is to keep track of the sequence length and store it at the boundary points of the sequence. For example, once the sequence {1, 2, 3, 4, 5} has been processed, map.get(1) and map.get(5) should both return 5.

public int longestConsecutive(int[] num) {
    int res = 0;
    HashMap<Integer, Integer> map = new HashMap<Integer, Integer>();
    for (int n : num) {
        if (!map.containsKey(n)) {
            int left = (map.containsKey(n - 1)) ? map.get(n - 1) : 0;
            int right = (map.containsKey(n + 1)) ? map.get(n + 1) : 0;
            // sum: length of the sequence n is in
            int sum = left + right + 1;
            map.put(n, sum);

            // keep track of the max length
            res = Math.max(res, sum);

            // extend the length to the boundary(s)
            // of the sequence
            // will do nothing if n has no neighbors
            map.put(n - left, sum);
            map.put(n + right, sum);
        }
        else {
            // duplicates
            continue;
        }
    }
    return res;
}

Another implementation

    public int longestConsecutive(int[] nums) {
        Map<Integer,Integer> ranges = new HashMap<>();
        int max = 0;
        for (int num : nums) {
            if (ranges.containsKey(num)) continue;

            // 1.Find left and right num
            int left = ranges.getOrDefault(num - 1, 0);
            int right = ranges.getOrDefault(num + 1, 0);
            int sum = left + right + 1;
            max = Math.max(max, sum);

            // 2.Union by only updating boundary
            // Leave middle k-v dirty to avoid cascading update
            if (left > 0) ranges.put(num - left, sum);
            if (right > 0) ranges.put(num + right, sum);
            ranges.put(num, sum); // Keep each number in Map to de-duplicate
        }
        return max;
    }

O(n) Time: Convert to set, scan from each sequence's lower bound

(10ms - 51.32% AC) HashSet and Intelligent Sequence Building

We only attempt to build sequences from numbers that are not already part of a longer sequence. This is accomplished by first ensuring that the number that would immediately precede the current number in a sequence is not present, as that number would necessarily be part of a longer sequence.

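The scan for the current sequence's length starts only at the leftmost element of a potential consecutive sequence. Although there are two nested loops, the inner loop visits each element at most once, so the time complexity is still O(n). Building the HashSet takes O(n) time and O(n) space.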

import java.util.HashSet;
import java.util.Set;

public class Solution {
    /**
     * @param nums: A list of integers
     * @return an integer
     */
    public int longestConsecutive(int[] nums) {
        // Collect the distinct values for O(1) average membership checks
        Set<Integer> hs = new HashSet<Integer>();
        for (int n : nums) {
            hs.add(n);
        }
        int longest = 0;
        for (int n : hs) {
            if (!hs.contains(n - 1)) {
                int m = n + 1;
                while (hs.contains(m)) {
                    m++;
                }
                longest = Math.max(longest, m - n);
            }
        }
        return longest;
    }
}
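One edge case worth noting: with int arithmetic, n - 1 and m++ wrap around at Integer.MIN_VALUE and Integer.MAX_VALUE, so an input such as [Integer.MAX_VALUE, Integer.MIN_VALUE] would be reported as a streak of length 2 by the code above. This does not matter when the input range is bounded well inside int, but if the full int range is possible, one option is to widen to long. The following is a sketch under that assumption; OverflowSafeStreak is a hypothetical name, not part of the original solution.

```java
import java.util.HashSet;
import java.util.Set;

class OverflowSafeStreak {
    static int longestConsecutive(int[] nums) {
        // Store values as long so that n - 1 and n + 1 never wrap around.
        Set<Long> seen = new HashSet<>();
        for (int n : nums) {
            seen.add((long) n);
        }
        int longest = 0;
        for (long n : seen) {
            if (seen.contains(n - 1)) {
                continue; // not a lower bound
            }
            long m = n + 1;
            while (seen.contains(m)) {
                m++;
            }
            // m - n is at most the array length, so it fits in an int.
            longest = Math.max(longest, (int) (m - n));
        }
        return longest;
    }
}
```

The structure is the same lower-bound scan; only the element type changes, at the cost of boxing Long instead of Integer.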

HashSet - (7ms 86.82% AC) LeetCode Official Solution

HashSet and Intelligent Sequence Building

import java.util.HashSet;
import java.util.Set;

class Solution {
    public int longestConsecutive(int[] nums) {
        if (nums == null || nums.length == 0) return 0;
        Set<Integer> numSet = new HashSet<Integer>();
        for (int num: nums) {
            numSet.add(num);
        }
        int longestStreak = 1;
        for (int num: numSet) {
            int currentStreak = 1;
            int currentNum = num;
            if (!numSet.contains(num - 1)) {
                while (numSet.contains(currentNum + 1)) {
                    currentStreak++;
                    currentNum++;
                }
                longestStreak = Math.max(currentStreak, longestStreak);
            }
        }
        return longestStreak;
    }
}

HashSet - Convert to set, expand left, right index and remove from set

HashSet O(n) solution runtime 5 ms, faster than 91.70% via @davidluoyes

public int longestConsecutive(int[] nums) {
    if(nums == null || nums.length == 0) return 0;
    Set<Integer> set = new HashSet<>();
    for(int i : nums) set.add(i);
    int ans = 0;
    for(int num : nums) {
        int left = num - 1;
        int right = num + 1;
        while(set.remove(left)) left--;
        while(set.remove(right)) right++;
        ans = Math.max(ans,right - left - 1);
        if(set.isEmpty()) return ans; // early exit: every value in the set has already been consumed
    }
    return ans;
}

Sorting First - (4ms 94.51% AC)

Time complexity: O(n log n). The main for loop does constant work n times, so the algorithm's time complexity is dominated by the call to sort, which runs in O(n log n) time for any sensible implementation.

Space complexity: O(1) (or O(n)), depending on whether we are allowed to modify the input array by sorting it in place. If not, we must spend linear space to store a sorted copy.

import java.util.Arrays;

class Solution {
    public int longestConsecutive(int[] nums) {
        if (nums.length == 0) {
            return 0;
        }

        Arrays.sort(nums);

        int longestStreak = 1;
        int currentStreak = 1;

        for (int i = 1; i < nums.length; i++) {
            if (nums[i] != nums[i-1]) {
                if (nums[i] == nums[i-1]+1) {
                    currentStreak += 1;
                }
                else {
                    longestStreak = Math.max(longestStreak, currentStreak);
                    currentStreak = 1;
                }
            }
        }

        return Math.max(longestStreak, currentStreak);
    }
}
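The solution above sorts nums in place, which is the O(1) extra space case. If the caller's array must stay untouched, the O(n) space case mentioned earlier could look like the sketch below. SortedCopyStreak is a hypothetical name, and the null check is an added assumption about the inputs.

```java
import java.util.Arrays;

class SortedCopyStreak {
    static int longestConsecutive(int[] nums) {
        if (nums == null || nums.length == 0) {
            return 0;
        }
        // Sort a copy so the caller's array is left unmodified (O(n) extra space).
        int[] sorted = nums.clone();
        Arrays.sort(sorted);

        int longest = 1;
        int current = 1;
        for (int i = 1; i < sorted.length; i++) {
            if (sorted[i] == sorted[i - 1]) {
                continue; // duplicates neither extend nor break a streak
            }
            if (sorted[i] == sorted[i - 1] + 1) {
                current++;
            } else {
                current = 1; // gap: start a new streak
            }
            longest = Math.max(longest, current);
        }
        return longest;
    }
}
```

Updating longest on every step removes the need for the final Math.max after the loop; otherwise the scan is the same as in the in-place version.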

Union Find - (9ms 64.18% AC)

import java.util.HashMap;
import java.util.Map;

public class Solution {
        public int longestConsecutive(int[] nums) {
            UF uf = new UF(nums.length);
            Map<Integer,Integer> map = new HashMap<Integer,Integer>(); // <value,index>
            for(int i=0; i<nums.length; i++){
                if(map.containsKey(nums[i])){
                    continue;
                }
                map.put(nums[i],i);
                if(map.containsKey(nums[i]+1)){
                    uf.union(i,map.get(nums[i]+1));
                }
                if(map.containsKey(nums[i]-1)){
                    uf.union(i,map.get(nums[i]-1));
                }
            }
            return uf.maxUnion();
        }
    }

    class UF{
        private int[] list;
        public UF(int n){
            list = new int[n];
            for(int i=0; i<n; i++){
                list[i] = i;
            }
        }

        private int root(int i){
            while(i!=list[i]){
                list[i] = list[list[i]];
                i = list[i];
            }
            return i;
        }

        public boolean connected(int i, int j){
            return root(i) == root(j);
        }

        public void union(int p, int q){
            int i = root(p);
            int j = root(q);
            list[i] = j;
        }

        // returns the size of the largest connected component
        public int maxUnion(){ // O(n)
            int[] count = new int[list.length];
            int max = 0;
            for(int i=0; i<list.length; i++){
                count[root(i)] ++;
                max = Math.max(max, count[root(i)]);
            }
            return max;
        }
    }
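In the version above, maxUnion recounts every component with an O(n) pass at the end. A common alternative is to track component sizes during union, so the largest size is available directly. The following is a sketch of that variant, not the original author's code: the class name SizedUnionFindStreak, the size array, and the largestComponent method are all introduced here for illustration, and the map uses long keys as an added precaution against int wrap-around.

```java
import java.util.HashMap;
import java.util.Map;

class SizedUnionFindStreak {
    private final int[] parent;
    private final int[] size;
    private int largest;

    SizedUnionFindStreak(int n) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
            size[i] = 1;
        }
        largest = n > 0 ? 1 : 0;
    }

    private int root(int i) {
        while (i != parent[i]) {
            parent[i] = parent[parent[i]]; // path halving
            i = parent[i];
        }
        return i;
    }

    void union(int p, int q) {
        int a = root(p);
        int b = root(q);
        if (a == b) {
            return;
        }
        if (size[a] < size[b]) { // attach the smaller tree under the larger one
            int t = a;
            a = b;
            b = t;
        }
        parent[b] = a;
        size[a] += size[b];
        largest = Math.max(largest, size[a]);
    }

    int largestComponent() {
        return largest;
    }

    static int longestConsecutive(int[] nums) {
        SizedUnionFindStreak uf = new SizedUnionFindStreak(nums.length);
        Map<Long, Integer> indexOf = new HashMap<>(); // <value, index>
        for (int i = 0; i < nums.length; i++) {
            long v = nums[i];
            if (indexOf.containsKey(v)) {
                continue; // duplicate value: leave its index as a singleton
            }
            indexOf.put(v, i);
            Integer up = indexOf.get(v + 1);
            Integer down = indexOf.get(v - 1);
            if (up != null) uf.union(i, up);
            if (down != null) uf.union(i, down);
        }
        return uf.largestComponent();
    }
}
```

Calling SizedUnionFindStreak.longestConsecutive(new int[] {100, 4, 200, 1, 3, 2}) returns 4. Union by size also keeps the trees shallow, which the plain list[i] = j link in the original does not guarantee.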


Last updated 5 years ago
