Hash Table

The Principle of Hash Table

Hash Table is a data structure which organizes data using hash functions in order to support quick insertion and search.

The key idea of Hash Table is to use a hash function to map keys to buckets

Design a Hash Table

Hash Function

The hash function will depend on the range of key values and the number of buckets. The idea is to try to assign the key to the bucket as uniform as you can. Ideally, a perfect hash function will be a one-one mapping between the key and the bucket. However, in most cases a hash function is not perfect and it is a tradeoff between the amount of buckets and the capacity of a bucket.

Collision Resolution

A collision resolution algorithm should solve the following questions:

  1. How to organize the values in the same bucket?

  2. What if too many values are assigned to the same bucket?

  3. How to search a target value in a specific bucket?

Let's assume that the bucket, which holds the maximum number of keys, hasNkeys.

Typically, ifNis constant and small, we can simply use anarrayto store keys in the same bucket. IfNis variable or large, we might need to useheight-balanced binary search treeinstead.

Practical Application - Design the Key

Actually,designing a keyis tobuild a mapping relationship by yourselfbetween the original information and the actual key used by hash map. When you design a key, you need to guarantee that:

  1. All values belong to the same group will be mapped in the same group.

  2. Values which needed to be separated into different groups will not be mapped into the same group.

HashSet in Java

https://www.geeksforgeeks.org/hashset-in-java/

                  Number of stored elements in the table
   load factor = -----------------------------------------
                        Size of the hash table

NOTE: The implementation in a HashSet is not synchronized, in the sense that if multiple threads access a hash set concurrently, and at least one of the threads modifies the set, it must be synchronized externally. This is typically accomplished by synchronizing on some object that naturally encapsulates the set. If no such object exists, the set should be “wrapped” using the Collections.synchronizedSet method. This is best done at creation time, to prevent accidental unsynchronized access to the set as shown below:

Set s = Collections.synchronizedSet(new HashSet(...));

Constructors in HashSet:

HashSet h = new HashSet();

Default initial capacity is 16 and default load factor is 0.75.

HashSet h = new HashSet(int initialCapacity);

default loadFactor of 0.75

HashSet h = new HashSet(int initialCapacity, float loadFactor);
HashSet h = new HashSet(Collection C);

LinkedHashMap

https://www.baeldung.com/java-linked-hashmap

Performance

Just like HashMap, LinkedHashMap _performs the basic Map _operations of add, remove and contains in constant-time, as long as the hash function is well-dimensioned. It also accepts a null key as well as null values.

However, this constant-time performance of LinkedHashMap is likely to be a little worse than the constant-time of _HashMap _due to the added overhead of maintaining a doubly-linked list.

Iteration over collection views of LinkedHashMap also takes linear time O(n) similar to that of HashMap. On the flip side, LinkedHashMap‘s linear time performance during iteration is better than HashMap‘s linear time.

This is because, for LinkedHashMap, n in O(n) is only the number of entries in the map regardless of the capacity. Whereas, for HashMap, n is capacity and the size summed up, O(size+capacity).

Load Factor and Initial Capacity are defined precisely as for HashMap. Note, however, that the penalty for choosing an excessively high value for initial capacity is less severe for LinkedHashMap than for HashMap, as iteration times for this class are unaffected by capacity.

Concurrency

Just like HashMap, LinkedHashMap_ _implementation is not synchronized. So if you are going to access it from multiple threads and at least one of these threads is likely to change it structurally, then it must be externally synchronized.

Map m = Collections.synchronizedMap(new LinkedHashMap());

Use Case:

LRU Cache:

    import java.util.LinkedHashMap;
    public class LRUCache {
        private LinkedHashMap<Integer, Integer> map;
        private final int CAPACITY;
        public LRUCache(int capacity) {
            CAPACITY = capacity;
            map = new LinkedHashMap<Integer, Integer>(capacity, 0.75f, true){
                protected boolean removeEldestEntry(Map.Entry eldest) {
                    return size() > CAPACITY;
                }
            };
        }
        public int get(int key) {
            return map.getOrDefault(key, -1);
        }
        public void set(int key, int value) {
            map.put(key, value);
        }
    }

Last updated