Top K Frequent Words II
Data Stream, HashMap, TreeSet
Description
Find the top k frequent words in a real-time data stream.
Implement three methods for the TopK class:
- TopK(k). The constructor.
- add(word). Add a new word.
- topk(). Get the current top k frequent words.
If two words have the same frequency, rank them alphabetically.
Example
TopK(2)
add("lint")
add("code")
add("code")
topk()
>> ["code", "lint"]
Solution
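To make the expected behavior concrete before looking at the optimized version, here is a minimal brute-force sketch. The class name NaiveTopK is hypothetical and not part of the original problem or the reference solution; it simply counts words in a HashMap and re-sorts all of them on every topk() call, which is assumed to be acceptable only for small inputs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical baseline (not the reference solution): re-sort on every query.
class NaiveTopK {
    private final int k;
    private final Map<String, Integer> counts = new HashMap<>();

    NaiveTopK(int k) {
        this.k = k;
    }

    // Increment the word's count, starting from 1 for a new word.
    void add(String word) {
        counts.merge(word, 1, Integer::sum);
    }

    // Sort all seen words: higher count first, ties broken alphabetically.
    List<String> topk() {
        List<String> words = new ArrayList<>(counts.keySet());
        words.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(words.subList(0, Math.min(k, words.size())));
    }
}
```

Replaying the example (TopK(2), then "lint", "code", "code") through this sketch yields ["code", "lint"]. Each topk() call costs a full sort of every distinct word seen so far, which is the cost an ordered structure avoids.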
Jiuzhang's Solution: HashMap + TreeSet
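Notes:
- In add(), if the TreeSet already contains the word, remove it first, update the word's count in the HashMap, and then add it back to the TreeSet so that it is ordered by the updated count. Otherwise, a word that is already in the TreeSet cannot change its position.
- The TreeSet here is ordered by word count, with higher counts ranked first, so it can be thought of as a Max-Heap (in PriorityQueue terms). Unlike a PriorityQueue, a TreeSet can remove its last element directly with pollLast(). Under this ordering the last element (the "highest" one according to the comparator) is the word with the smallest count, so the Max-Heap idea can be used here instead of a Min-Heap.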
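The remove, update, re-insert pattern described in the notes above can be sketched in isolation as follows. This is an illustrative sketch, not the reference solution: the class ReinsertDemo and its helpers bump, newRanking, and demo are hypothetical names, and the comparator is assumed to treat a word with no recorded count as having count 0 so that lookups for unseen words are safe.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical demo of the remove -> update -> re-insert pattern.
class ReinsertDemo {
    // Ranking: higher count first, ties broken alphabetically.
    // Assumption: a word missing from the map is treated as count 0.
    static TreeSet<String> newRanking(Map<String, Integer> counts) {
        return new TreeSet<>((a, b) -> {
            int byCount = Integer.compare(counts.getOrDefault(b, 0),
                                          counts.getOrDefault(a, 0));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
    }

    static void bump(Map<String, Integer> counts, TreeSet<String> ranked,
                     String word, int k) {
        ranked.remove(word);                  // remove while the old count still locates it
        counts.merge(word, 1, Integer::sum);  // update the value the comparator reads
        ranked.add(word);                     // re-insert at the position for the new count
        if (ranked.size() > k) {
            ranked.pollLast();                // drop the lowest-ranked word
        }
    }

    static List<String> demo(int k) {
        Map<String, Integer> counts = new HashMap<>();
        TreeSet<String> ranked = newRanking(counts);
        for (String w : new String[] {"lint", "code", "code", "lint", "lint"}) {
            bump(counts, ranked, w, k);
        }
        return new ArrayList<>(ranked);
    }

    public static void main(String[] args) {
        System.out.println(demo(2));
    }
}
```

The order of the three steps in bump matters: if the count were changed while the word was still inside the set, the tree would be searched with the new count and could fail to find the old entry.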
/**
* This reference program is provided by @jiuzhang.com
* Copyright is reserved. Please indicate the source for forwarding
*/
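import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeSet;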
public class TopK {
    private Map<String, Integer> words = null;
    private NavigableSet<String> topk = null;
    private int k;
    // Orders words by count (higher first), then alphabetically.
    private Comparator<String> myComparator = new Comparator<String>() {
        public int compare(String left, String right) {
            if (left.equals(right))
                return 0;
            int left_count = words.get(left);
            int right_count = words.get(right);
            if (left_count != right_count) {
                // Integer.compare avoids the overflow risk of subtraction.
                return Integer.compare(right_count, left_count);
            }
            return left.compareTo(right);
        }
    };
    public TopK(int k) {
        // initialize your data structure here
        this.k = k;
        words = new HashMap<String, Integer>();
        topk = new TreeSet<String>(myComparator);
    }
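    public void add(String word) {
        if (words.containsKey(word)) {
            // Remove before changing the count so the set stays consistent.
            if (topk.contains(word))
                topk.remove(word);
            words.put(word, words.get(word) + 1);
        } else {
            words.put(word, 1);
        }
        // Re-insert so the word is positioned by its updated count.
        topk.add(word);
        if (topk.size() > k) {
            // Drop the lowest-ranked word to keep at most k entries.
            topk.pollLast();
        }
    }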
    public List<String> topk() {
        // Copy the set in ranked order (highest count first).
        List<String> results = new ArrayList<String>();
        Iterator<String> it = topk.iterator();
        while (it.hasNext()) {
            results.add(it.next());
        }
        return results;
    }
}