Top k Largest Numbers

Question

Given an integer array, find the top k largest numbers in it.

Example

Given [3,10,1000,-99,4,100] and k = 3. Return [1000, 100, 10].

Analysis

此题运用Priority Queue,也就是Max Heap来求解十分简洁。重点在于对该数据结构的理解和应用。 使用Max Heap,也就是保证了堆顶的元素是整个堆最大的那一个。第一步,先构建一个capacity为k的Max Heap,这需要O(n)时间;第二步,对这个Max Heap进行k次poll()操作,丛中取出k个maximum的元素,这一步操作需要O(k logn)。因此最终整个操作的时间复杂度是 O(n + k logn),空间复杂度是O(k)

对于Top k largest(or smallest) elements in an array问题,有如下几种常见方法:

  1. 冒泡排序k次

  2. 使用临时数组

  3. 使用排序

  4. 使用最大堆

  5. 使用排序统计

  6. 使用最小堆

(摘自GeeksForGeeks)

Method 1. Use bubble k times

Thanks to Shailendra for suggesting this approach.

1) Modify Bubble Sort to run the outer loop at most k times.

2) Print the last k elements of the array obtained in step 1.

Time Complexity: O(nk)

Like Bubble sort, other sorting algorithms like Selection Sort can also be modified to get the k largest elements.

Method 2. Use temporary array

K largest elements from arr[0..n-1]

1) Store the first k elements in a temporary array temp[0..k-1].

2) Find the smallest element in temp[], let the smallest element be min.

3) For each element x in arr[k] to arr[n-1] If x is greater than the min then remove min from temp[] and insert x.

4) Print final k elements of temp[]

Time Complexity: O((n-k) k). If we want the output sorted then O((n-k) k + klogk)

Method 3. Use sorting

1) Sort the elements in descending order in O(nLogn)

2) Print the first k numbers of the sorted array O(k).

Time complexity: O(nlogn)

Method 4. Use Max heap

1) Build a Max Heap tree in O(n)

2) Use Extract Max k times to get k maximum elements from the Max Heap O(klogn)

Time complexity: O(n + klogn)

Method 5. Use Order Statistics

1) Use order statistic algorithm to find the kth largest element. Please see the topic selection in worst-case linear time O(n) 2) Use QuickSort Partition algorithm to partition around the kth largest number O(n). 3) Sort the k-1 elements (elements greater than the kth largest element) O(kLogk). This step is needed only if sorted output is required.

Time complexity: O(n) if we don’t need the sorted output, otherwise O(n+kLogk)

Thanks to Shilpi for suggesting the first two approaches.

Method 6. Use Min Heap

This method is mainly an optimization of method 2 temp array. Instead of using temp[] array, use Min Heap.

Thanks to geek4u for suggesting this method.

1) Build a Min Heap MH of the first k elements (arr[0] to arr[k-1]) of the given array. O(k)

2) For each element, after the kth element (arr[k] to arr[n-1]), compare it with root of MH. ……a) If the element is greater than the root then make it root and call heapify for MH ……b) Else ignore it.

The step 2 is O((n-k) * logk)

3) Finally, MH has k largest elements and root of the MH is the kth largest element.

Time Complexity: O(k + (n-k)Logk) without sorted output. If sorted output is needed then O(k + (n-k)Logk + kLogk)

All of the above methods can also be used to find the kth largest (or smallest) element.

Solution

Max Heap

class Solution {
    /*
     * @param nums an integer array
     * @param k an integer
     * @return the top k largest numbers in array
     */
     public int[] topk(int[] nums, int k) {
         Comparator<Integer> comparator = new Comparator<Integer>() {
             @Override
             public int compare(Integer o1, Integer o2) {
                 if(o1 < o2) {
                     return 1;
                 } else if(o1 > o2) {
                     return -1;
                 } else {
                     return 0;
                 }
             }
         };
         PriorityQueue<Integer> minheap = new PriorityQueue<Integer>(k, comparator);

         for (int i : nums) {
             minheap.add(i);
         }

         int[] result = new int[k];
         for (int i = 0; i < result.length; i++) {
             result[i] = minheap.poll();
         }
         return result;
    }
};

Min Heap

class Solution {
    /*
     * @param nums an integer array
     * @param k an integer
     * @return the top k largest numbers in array
     */
    public int[] topk(int[] nums, int k) {
        PriorityQueue<Integer> minheap = new PriorityQueue<Integer>();
        for (int i = 0; i < k; i++) {
           minheap.offer(nums[i]);
        }
        for (int i = k; i < nums.length; i++) {
            if (nums[i] > minheap.peek()) {
                minheap.poll();
                minheap.offer(nums[i]);
            }
        }
        Iterator it = minheap.iterator();
        List<Integer> result = new ArrayList<Integer>();
        while(it.hasNext()) {
            result.add((Integer) it.next());
        }
        Collections.sort(result, Collections.reverseOrder());

        int[] res = new int[result.size()];
        Iterator<Integer> iter = result.iterator();
        for (int i = 0; i < res.length; i++) {
            res[i] = iter.next().intValue();
        }
        return res;
    }
};

Sorting

class Solution {
    /*
     * @param nums an integer array
     * @param k an integer
     * @return the top k largest numbers in array
     */
    public int[] topk(int[] nums, int k) {
        Arrays.sort(nums);
        int[] result = new int[k];
        int j = 0;
        for (int i = nums.length - 1; i > nums.length - k - 1; i--) {
            result[j] = nums[i];
            j++;
        }
        return result;
    }
};

Reference

Last updated