Hash Table
The Principle of Hash Table
A hash table is a data structure that organizes data using hash functions in order to support quick insertion and search.
The key idea of a hash table is to use a hash function to map keys to buckets.
Design a Hash Table
Hash Function
The hash function depends on the range of key values and the number of buckets. The idea is to distribute the keys across the buckets as uniformly as possible. Ideally, a perfect hash function is a one-to-one mapping between keys and buckets. However, in most cases a hash function is not perfect, and there is a tradeoff between the number of buckets and the capacity of each bucket.
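As a minimal sketch, a common way to map a key to a bucket is to take the key's hash code modulo the bucket count. The class name and bucket count below are illustrative assumptions:

```java
// Sketch of a bucket-assigning hash function: illustrative names,
// assuming string keys and a fixed bucket count of 16.
public class SimpleHash {
    static final int NUM_BUCKETS = 16;

    // Map a key to a bucket index in [0, NUM_BUCKETS).
    static int bucketIndex(String key) {
        // Math.floorMod guards against negative hashCode() values.
        return Math.floorMod(key.hashCode(), NUM_BUCKETS);
    }

    public static void main(String[] args) {
        System.out.println(bucketIndex("apple"));
    }
}
```

Note that a power-of-two bucket count makes the modulo cheap, but it also means the low bits of the hash code decide the bucket, so a hash function with poorly distributed low bits will cluster keys.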
Collision Resolution
A collision resolution algorithm should solve the following questions:
How to organize the values in the same bucket?
What if too many values are assigned to the same bucket?
How to search a target value in a specific bucket?
Let's assume that the bucket which holds the maximum number of keys has N keys. Typically, if N is constant and small, we can simply use an array to store keys in the same bucket. If N is variable or large, we might need to use a height-balanced binary search tree instead.
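The list-per-bucket strategy above (separate chaining) can be sketched as follows; integer keys and a bucket count of 16 are illustrative assumptions:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of separate chaining: each bucket is a list of keys,
// searched linearly. Illustrative, not production code.
public class ChainedHashSet {
    private static final int NUM_BUCKETS = 16;
    private final List<List<Integer>> buckets = new ArrayList<>();

    public ChainedHashSet() {
        for (int i = 0; i < NUM_BUCKETS; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    private List<Integer> bucketFor(int key) {
        return buckets.get(Math.floorMod(key, NUM_BUCKETS));
    }

    public void add(int key) {
        List<Integer> b = bucketFor(key);
        if (!b.contains(key)) b.add(key); // linear scan within one bucket
    }

    public boolean contains(int key) {
        return bucketFor(key).contains(key);
    }

    public void remove(int key) {
        bucketFor(key).remove(Integer.valueOf(key)); // remove by value, not index
    }
}
```

Swapping each bucket's list for a balanced BST would turn the in-bucket linear scan into a logarithmic search, which is the tradeoff the paragraph above describes.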
Practical Application - Design the Key
Actually, designing a key means building a mapping relationship yourself between the original information and the actual key used by the hash map. When you design a key, you need to guarantee that:
All values belonging to the same group are mapped to the same key.
Values which need to be separated into different groups are not mapped to the same key.
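A classic illustration of designing a key is grouping anagrams: using each word's sorted character sequence as the map key sends all anagrams to the same group and non-anagrams to different groups. The class name and sample words are illustrative:

```java
import java.util.*;

// Sketch of "designing a key": the sorted form of a word serves as
// the hash-map key, so anagrams land in the same group.
public class AnagramGroups {
    public static Map<String, List<String>> group(String[] words) {
        Map<String, List<String>> groups = new HashMap<>();
        for (String w : words) {
            char[] chars = w.toCharArray();
            Arrays.sort(chars);              // the designed key
            String key = new String(chars);
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(w);
        }
        return groups;
    }
}
```

Here `group(new String[]{"eat", "tea", "tan"})` puts "eat" and "tea" under the key "aet" and "tan" under "ant", satisfying both guarantees above.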
HashSet in Java
https://www.geeksforgeeks.org/hashset-in-java/
NOTE: The implementation of a HashSet is not synchronized, in the sense that if multiple threads access a hash set concurrently and at least one of the threads modifies the set, it must be synchronized externally. This is typically accomplished by synchronizing on some object that naturally encapsulates the set. If no such object exists, the set should be "wrapped" using the Collections.synchronizedSet method, best done at creation time to prevent accidental unsynchronized access to the set.
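The wrap-at-creation pattern looks like this (the class name is illustrative); note that iteration over the wrapped set still requires manual synchronization on the wrapper:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Wrapping a HashSet at creation time so every reference goes
// through the synchronized wrapper.
public class SyncSetDemo {
    public static void main(String[] args) {
        Set<String> set = Collections.synchronizedSet(new HashSet<>());
        set.add("a");
        // Iteration must still be synchronized manually on the wrapper:
        synchronized (set) {
            for (String s : set) {
                System.out.println(s);
            }
        }
    }
}
```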
Constructors in HashSet:
The default initial capacity is 16 and the default load factor is 0.75.
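The constructor overloads let you override either default; the capacity and load factor values below are illustrative choices:

```java
import java.util.HashSet;
import java.util.Set;

// HashSet constructor overloads (values are illustrative).
public class HashSetCtors {
    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>();         // defaults: capacity 16, load factor 0.75
        Set<Integer> b = new HashSet<>(32);       // custom initial capacity
        Set<Integer> c = new HashSet<>(32, 0.6f); // custom capacity and load factor
        a.add(1);
        b.add(2);
        c.add(3);
        System.out.println(a + " " + b + " " + c);
    }
}
```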
LinkedHashMap
https://www.baeldung.com/java-linked-hashmap
Performance
Just like HashMap, LinkedHashMap performs the basic Map operations of add, remove and contains in constant time, as long as the hash function is well-dimensioned. It also accepts a null key as well as null values.
However, this constant-time performance of LinkedHashMap is likely to be a little worse than that of HashMap due to the added overhead of maintaining a doubly-linked list.
Iteration over the collection views of LinkedHashMap also takes linear time, O(n), similar to HashMap. On the flip side, LinkedHashMap's linear-time iteration is better than HashMap's.
This is because, for LinkedHashMap, the n in O(n) is only the number of entries in the map, regardless of the capacity. For HashMap, on the other hand, iteration is proportional to the capacity plus the size, i.e. O(capacity + size).
Load Factor and Initial Capacity are defined precisely as for HashMap. Note, however, that the penalty for choosing an excessively high value for initial capacity is less severe for LinkedHashMap than for HashMap, as iteration times for this class are unaffected by capacity.
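Beyond performance, the doubly-linked list is what gives LinkedHashMap its predictable iteration order, keys come back in insertion order, whereas HashMap's order is unspecified. A small sketch (map contents are illustrative):

```java
import java.util.*;

// LinkedHashMap iterates in insertion order thanks to its internal
// doubly-linked list; HashMap makes no ordering guarantee.
public class OrderDemo {
    public static void main(String[] args) {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("first", 1);
        m.put("second", 2);
        m.put("third", 3);
        System.out.println(m.keySet()); // insertion order: first, second, third
    }
}
```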
Concurrency
Just like HashMap, the LinkedHashMap implementation is not synchronized. So if you are going to access it from multiple threads and at least one of these threads is likely to change it structurally, then it must be externally synchronized.
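External synchronization can follow the same wrap-at-creation pattern used for HashSet, here via Collections.synchronizedMap (a minimal sketch; the class name is illustrative):

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Wrapping a LinkedHashMap at creation time in a synchronized wrapper.
public class SyncLinkedMapDemo {
    public static void main(String[] args) {
        Map<String, Integer> map =
                Collections.synchronizedMap(new LinkedHashMap<>());
        map.put("a", 1);
        System.out.println(map.get("a"));
    }
}
```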
Use Case:
LRU Cache:
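An LRU cache falls out almost directly from LinkedHashMap: its access-order constructor keeps the least recently used entry at the head, and overriding removeEldestEntry evicts it on overflow. A sketch, with an illustrative class name and capacity:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LRU cache sketch built on LinkedHashMap's access-order mode and
// removeEldestEntry eviction hook.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // true = order by access, not insertion
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the least recently used entry
    }
}
```

With capacity 2, putting keys 1 and 2, touching 1 with a get, then putting 3 evicts key 2, the least recently used entry.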