Find Median from Data Stream
Question
Numbers keep coming, return the median of numbers at every time a new number added.
Clarification
What's the definition of Median?
Median is the number that in the middle of a sorted array. If there are n numbers in a sorted array A, the median is A[(n - 1) / 2]. For example, if A=[1,2,3], median is 2. If A=[1,19], median is 1.
Example
For numbers coming list: [1, 2, 3, 4, 5], return [1, 1, 2, 2, 3].
For numbers coming list: [4, 5, 1, 3, 2, 6, 0], return [4, 4, 4, 3, 3, 3, 3].
For numbers coming list: [2, 20, 100], return [2, 2, 20].
Challenge
Total run time in O(nlogn).
Tags
LintCode Copyright Heap Priority Queue Google
Related Problems
Hard Sliding Window Median 17 % Easy Median 22 % Hard Median of two Sorted Arrays
Problem description on LeetCode:
Median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value. So the median is the mean of the two middle value.
For example,
[2,3,4]
, the median is3
[2,3]
, the median is(2 + 3) / 2 = 2.5
Design a data structure that supports the following two operations:
void addNum(int num) - Add a integer number from the data stream to the data structure.
double findMedian() - Return the median of all elements so far.
Example:
Follow up:
If all integer numbers from the stream are between 0 and 100, how would you optimize it?
If 99% of all integer numbers from the stream are between 0 and 100, how would you optimize it?
Analysis
寻找中位数median,这里有一种很巧妙的思路,需要利用Heap的半排序特性,是指root node的值是min (min heap)或者max(max heap)。
利用两个heap,一个minHeap,一个maxHeap,将nums[]的数顺序存入时,则可以分别存入maxHeap和minHeap,并保持这两个heap的size相同,保持两者size相同的操作通过minHeap.offer(maxHeap.poll());
和maxHeap.offer(minHeap.poll());
The basic idea is to maintain two heaps: a max-heap and a min-heap. The max heap stores the smaller half of all numbers while the min heap stores the larger half. The sizes of two heaps need to be balanced each time when a new number is inserted so that their size will not be different by more than 1. Therefore each time when findMedian() is called we check if two heaps have the same size. If they do, we should return the average of the two top values of heaps. Otherwise we return the top of the heap which has one more element. -- @hanhanbu
https://discuss.leetcode.com/topic/27506/easy-to-understand-double-heap-solution-in-java
Notice
LeetCode中与LintCode里稍有不同,在于对于Median的定义:
LeetCode要求是如果是even number,就计算中间两个数字的平均值,也就是
(maxHeap.peek() + (minHeap.peek()))/2
而LintCode的要求则是当even number时取[N/2]那一个,也就是说不论是否even number,需要返回的都是
maxHeap.peek()
延伸思考
这里建立Heap依然需要知道即将传入的nums[]的长度,如果对于一个未知长度的nums[]这里应当如何处理呢?
Solution
Double Heap (minHeap + maxHeap)
Reference
LeetCode Official Solution: https://leetcode.com/problems/find-median-from-data-stream/solution/
Last updated