Design a HashMap (Low-Level Design)

🧠 Understanding HashMap in Java

1. What is a HashMap?

  • Definition: A HashMap is a part of Java's Collections Framework that stores data in key-value pairs.

  • Key Characteristics:

    • Unique Keys: Each key in a HashMap is unique.

    • Allows Nulls: One null key is permitted, and any number of values may be null.

    • Unordered: The order of entries is not guaranteed.
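
A quick demo of these characteristics using the standard java.util.HashMap (the class name HashMapBasicsDemo is just for illustration):

import java.util.HashMap;

public class HashMapBasicsDemo {
    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();

        map.put("A", 1);
        map.put("A", 2);     // duplicate key: old value 1 is overwritten
        map.put(null, 0);    // a single null key is allowed
        map.put("B", null);  // null values are allowed

        System.out.println(map.get("A"));  // 2
        System.out.println(map.get(null)); // 0
        System.out.println(map);           // iteration order is not guaranteed
    }
}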

2. Internal Structure

  • Bucket Array: Internally, a HashMap uses an array of buckets to store data.

  • Node Class: Each bucket holds a linked list of Node objects (converted to a balanced tree in Java 8+ when a chain grows long), where each node represents one key-value pair.

 

// Simplified excerpt of the node type used by java.util.HashMap
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;
    // constructor and Map.Entry methods omitted
}


3. How HashMap Stores Data

  • Hashing: When a key-value pair is added:

    1. Calculate Hash: The hashCode() of the key is computed.

    2. Determine Bucket Index: The hash is used to determine the index in the bucket array.

    3. Handle Collisions: If multiple keys hash to the same index, they are stored in a linked list at that bucket index.

  • Treeification: If the number of entries in a single bucket exceeds a threshold (default 8) and the table has at least 64 buckets, that bucket's linked list is converted into a balanced red-black tree to keep worst-case lookups fast.
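
For reference, the real java.util.HashMap spreads the hash bits and then masks with (capacity - 1) instead of using %, which works because the capacity is always a power of two. A small self-contained sketch (the class BucketIndexDemo is ours; the hash-spreading line mirrors the JDK source):

public class BucketIndexDemo {
    // Mirrors java.util.HashMap's hash(): XORs the high 16 bits into the low 16 bits
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int capacity = 16;                  // always a power of two in HashMap
        String key = "A";
        int hash = spread(key);
        int index = (capacity - 1) & hash;  // same as hash % capacity for powers of two
        System.out.println(key + " -> bucket " + index);
    }
}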

4. Resizing the HashMap

  • Load Factor: The default load factor is 0.75, meaning the HashMap will resize when 75% of its capacity is filled.

  • Threshold: The threshold for resizing is calculated as capacity * loadFactor.

  • Resizing Process:

    1. Double Capacity: The capacity of the bucket array is doubled.

    2. Rehashing: All existing entries are rehashed and redistributed across the new bucket array.
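
The arithmetic behind this, as a tiny standalone sketch (ResizeMathDemo is an illustrative name; the formula follows the threshold definition above):

public class ResizeMathDemo {
    public static void main(String[] args) {
        int capacity = 16;
        float loadFactor = 0.75f;
        int threshold = (int) (capacity * loadFactor);
        System.out.println("threshold = " + threshold); // 12

        // After the threshold is crossed, capacity doubles and the threshold is recomputed;
        // every existing entry is then rehashed into the new, larger bucket array.
        capacity *= 2;
        threshold = (int) (capacity * loadFactor);
        System.out.println("new capacity = " + capacity + ", new threshold = " + threshold); // 32, 24
    }
}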

5. Performance Considerations

  • Time Complexity:

    • Average Case: O(1) for put(), get(), and remove() operations.

    • Worst Case: O(n) if many keys hash to the same bucket, but this is mitigated by converting to a balanced tree when necessary.

  • Choosing Initial Capacity and Load Factor:

    • Initial Capacity: The number of buckets in the HashMap. A higher initial capacity can reduce the need for resizing.

    • Load Factor: Determines when to resize. A lower load factor reduces collisions but increases memory usage.
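
A hedged example of tuning these two knobs with the standard HashMap constructors (the sizes chosen here are arbitrary):

import java.util.HashMap;

public class TuningDemo {
    public static void main(String[] args) {
        // Expecting many entries up front: a larger initial capacity avoids repeated resizes
        HashMap<String, Integer> preSized = new HashMap<>(1024);

        // Lower load factor: fewer collisions, but more memory per entry
        HashMap<String, Integer> sparse = new HashMap<>(16, 0.5f);

        // Higher load factor: denser table, but more collisions on average
        HashMap<String, Integer> dense = new HashMap<>(16, 0.9f);

        preSized.put("key", 1);
        System.out.println(preSized.get("key")); // 1
    }
}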









-----------------------------------------------------------------------------------------------------------------------------

2. Hashing & HashCode

Every key in Java has a hashCode() method:

String key = "A";

int hash = key.hashCode(); // generates an integer


  • HashMap uses this hash to calculate the bucket index:

index = hash % capacity

  • Capacity: Number of buckets in the map (default 16).

  • If two keys have the same index, it’s called a collision.

3. Collision Handling (LinkedList)

  • If multiple keys map to the same index, we use a LinkedList (or a balanced tree in Java 8+ if too many collisions).

Example:

Capacity = 5

Keys: "A", "F", "K"

Hashcodes: A=65, F=70, K=75

Index = hash % 5 => 65%5=0, 70%5=0, 75%5=0


All 3 keys collide in bucket 0:


Bucket 0 -> [A=1] -> [F=2] -> [K=3] -> null

Bucket 1 -> null

Bucket 2 -> null

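A quick check that these three keys really collide, using plain modulo to match the example (real HashMap uses a power-of-two capacity and bit masking instead; CollisionDemo is just an illustrative name):

public class CollisionDemo {
    public static void main(String[] args) {
        int capacity = 5;
        String[] keys = {"A", "F", "K"}; // hashCodes 65, 70, 75

        for (String key : keys) {
            int index = key.hashCode() % capacity;
            System.out.println(key + " (hash " + key.hashCode() + ") -> bucket " + index);
        }
        // All three land in bucket 0, so they are chained in the same linked list.
    }
}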

4. Limiting LinkedList & Why Resize Happens

  • Problem: If too many keys collide → LinkedList becomes long → O(n) instead of O(1)

  • Solution: Resize the array (increase capacity) → redistribute keys

Resize Trigger:

size > capacity × loadFactor

  • Load Factor: Default 0.75 in Java → 75% full triggers resizing

  • Why 0.75?

    • Balances memory and time complexity

    • Low load factor → more space wasted

    • High load factor → more collisions → slower
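
The resize trigger itself is just a comparison; a minimal sketch of the check (the numbers match the defaults above, and the class name is illustrative):

public class ResizeTriggerDemo {
    public static void main(String[] args) {
        int capacity = 16;
        float loadFactor = 0.75f;
        int size = 13; // entries stored after the latest put()

        // Resize is triggered once size exceeds capacity * loadFactor (16 * 0.75 = 12)
        if (size > capacity * loadFactor) {
            System.out.println("Resize: capacity " + capacity + " -> " + (capacity * 2));
        }
    }
}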

5. Resizing & Rehashing

  • When capacity exceeded:

    1. Double the capacity (capacity = 2 * oldCapacity)

    2. Rehash all existing keys → place them in new buckets

Old capacity = 4
Old buckets:
0 -> [A]
1 -> [B]
2 -> [C]
3 -> null

Size = 3, Load factor = 0.75
Threshold = 4 * 0.75 = 3 → the next (4th) insertion exceeds the threshold → resize triggered
New capacity = 8
Rehash all keys:
Index = hash % 8


Node Structure (LinkedList in Bucket)


class Node<K, V> {
    final int hash; // cached hash of key
    final K key;
    V value;
    Node<K, V> next; // points to next in linked list

    Node(int hash, K key, V value, Node<K, V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }
}

  • Each bucket can have multiple nodes in a linked list.

  • When we add a new key, we traverse the list:

    • If key exists: update value

    • If the key does not exist: add a new node at the end of the list (sketched below)
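
A compact, self-contained sketch of that traversal (the Node here mirrors the structure above; note that the MyHashMap implementation later in this post inserts new nodes at the head instead, which is simpler):

public class BucketPutDemo {
    // Simplified node, mirroring the structure shown above
    static class Node {
        final String key;
        int value;
        Node next;
        Node(String key, int value) { this.key = key; this.value = value; }
    }

    // Update the value if the key exists in this bucket, otherwise append at the end
    static Node putInBucket(Node head, String key, int value) {
        if (head == null) return new Node(key, value);
        Node current = head;
        while (true) {
            if (current.key.equals(key)) {
                current.value = value;               // key exists: update
                return head;
            }
            if (current.next == null) {
                current.next = new Node(key, value); // key absent: add at end
                return head;
            }
            current = current.next;
        }
    }

    public static void main(String[] args) {
        Node bucket = null;
        bucket = putInBucket(bucket, "A", 1);
        bucket = putInBucket(bucket, "F", 2);
        bucket = putInBucket(bucket, "A", 10); // updates A instead of adding a duplicate
        for (Node n = bucket; n != null; n = n.next) {
            System.out.println(n.key + "=" + n.value); // A=10, F=2
        }
    }
}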


    6. When Does HashMap Resize?

    HashMap resizes when the number of stored entries exceeds the threshold.

    Threshold = Capacity × Load Factor

    • Capacity = number of buckets (default 16)

    • Load Factor = fraction at which resizing should happen (default 0.75)

    • Threshold = maximum number of entries allowed before resizing

    👉 Example:


    Capacity = 16

    Load Factor = 0.75

    Threshold = 16 * 0.75 = 12


    ➡️ When the 13th entry is inserted, resize is triggered.


    7. How Does Resizing Work?

    When resize is triggered:

    1. Double the Capacity
      New capacity = old capacity × 2 (e.g., 16 → 32).

    2. Rehashing

      • Every key’s index is recalculated using:

        index = hash % newCapacity
      • All nodes are redistributed into the new bucket array.

    3. Why Rehash?
      Because the bucket index depends on the capacity (index = hash % capacity).
      If the capacity changes, the bucket indexes change as well.
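
To see this, here is a small demo showing the same key landing in a different bucket once the capacity doubles (plain modulo is used to match the formulas in this post; the key "Z" is chosen because its hashCode, 90, illustrates the change):

public class RehashDemo {
    public static void main(String[] args) {
        String key = "Z";           // "Z".hashCode() == 90
        int hash = key.hashCode();

        int oldIndex = hash % 16;   // 90 % 16 = 10
        int newIndex = hash % 32;   // 90 % 32 = 26

        System.out.println("index at capacity 16: " + oldIndex);
        System.out.println("index at capacity 32: " + newIndex);
        // The bucket index changed, so every entry must be rehashed after a resize.
    }
}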




    HashMap<String, Integer> map = new HashMap<>(4, 0.75f);

    • Capacity = 4

    • Load Factor = 0.75

    • Threshold = 4 × 0.75 = 3

    ✅ Up to 3 entries → no resize
    ❌ When the 4th entry is added, the threshold is exceeded → resize to capacity = 8 → rehash all keys



    8. Example Flow

    Keys: "A"=1, "F"=2, "K"=3
    Capacity: 5

    Step 1: Compute hash -> index

    "A" -> 65%5=0

    "F" -> 70%5=0

    "K" -> 75%5=0


    Step 2: Insert into bucket 0 linked list

    0 -> [A=1] -> [F=2] -> [K=3] -> null


    After exceeding load factor:

    • Resize array → redistribute keys

    • LinkedList lengths reduce → lookup faster


     Java Load Factor Default

    • 0.75 is default

    • Reason: Efficient memory + speed balance

    • For lower memory use → increase the load factor (e.g., 0.9), accepting more collisions

    • For faster lookups → decrease the load factor (e.g., 0.5), accepting more memory use




package hashmap;

public class MyHashMap<K, V> {

    // Inner Node class: one key-value pair in a bucket's linked list
    public class Node {
        K key;
        V value;
        Node next;

        public Node(K key, V value) {
            this.key = key;
            this.value = value;
            this.next = null;
        }
    }

    private Node[] buckets; // each slot holds the head of a linked list (or null)
    private float loadFactor;
    private int capacity;
    private int size;

    @SuppressWarnings("unchecked") // suppress the unchecked array cast below
    public MyHashMap() {
        this.capacity = 4;
        this.loadFactor = 0.75f;
        // Some compilers reject this line as "generic array creation" because Node is an
        // inner class of a generic class; the Object[]-based version further below avoids that.
        this.buckets = (Node[]) new Node[capacity];
        this.size = 0;
    }

    // Hash function: convert key -> bucket index
    private int getBucketIndex(K key) {
        int hashcode = key.hashCode();
        // Math.abs guards against negative hash codes (Integer.MIN_VALUE is a known edge case)
        return Math.abs(hashcode) % capacity;
    }

    // Put key-value pair
    public void put(K key, V value) {
        int index = getBucketIndex(key);
        Node head = buckets[index];

        // Check if key already exists
        while (head != null) {
            if (head.key.equals(key)) {
                head.value = value; // update existing value
                return;
            }
            head = head.next;
        }

        // Insert new node at the head of the bucket's list
        Node newNode = new Node(key, value);
        newNode.next = buckets[index];
        buckets[index] = newNode;
        size++;

        // Resize once size exceeds capacity * loadFactor
        if (size > capacity * loadFactor) {
            resize();
        }
    }

    // Get value by key
    public V get(K key) {
        int index = getBucketIndex(key);
        Node head = buckets[index];

        while (head != null) {
            if (head.key.equals(key)) {
                return head.value;
            }
            head = head.next;
        }

        return null; // not found
    }

    // Resize the bucket array when the load factor is exceeded
    @SuppressWarnings("unchecked")
    private void resize() {
        int newCapacity = capacity * 2;
        Node[] newBuckets = (Node[]) new Node[newCapacity];

        for (int i = 0; i < capacity; i++) {
            Node head = buckets[i];
            while (head != null) {
                Node next = head.next;
                int index = Math.abs(head.key.hashCode()) % newCapacity;

                // Re-link this node at the head of its new bucket
                head.next = newBuckets[index];
                newBuckets[index] = head;

                head = next;
            }
        }

        buckets = newBuckets;
        capacity = newCapacity;
    }
}


    If you hit a "generic array creation" error with the version above, try the Object[]-based version below:


package hashmap;

public class MyHashMap<K, V> { // declare K, V here

    // Inner Node class: one key-value pair in a bucket's linked list
    public class Node {
        K key;
        V value;
        Node next; // next node in the same bucket

        public Node(K key, V value) {
            this.key = key;
            this.value = value;
            this.next = null;
        }
    }

    // Backing the map with Object[] avoids the "generic array creation" problem
    private Object[] buckets;
    private float loadFactor;
    private int capacity;
    private int size;

    public MyHashMap() {
        this.capacity = 4;
        this.loadFactor = 0.75f;
        this.buckets = new Object[capacity];
        this.size = 0;
    }

    // Helpers that hide the cast between Object and Node
    @SuppressWarnings("unchecked")
    private Node getNode(int index) {
        return (Node) buckets[index];
    }

    private void setNode(int index, Node node) {
        buckets[index] = node;
    }

    // Hash function: convert key -> bucket index
    private int getBucketIndex(K key) {
        int hashcode = key.hashCode();
        return Math.abs(hashcode) % capacity;
    }

    // Put key-value pair
    public void put(K key, V value) {
        int index = getBucketIndex(key);
        Node head = getNode(index);

        // Check if key already exists
        while (head != null) {
            if (head.key.equals(key)) {
                head.value = value; // update existing value
                return;
            }
            head = head.next;
        }

        // Insert new node at the head of the bucket's list
        Node newNode = new Node(key, value);
        newNode.next = getNode(index);
        setNode(index, newNode);
        size++;

        // Resize once size exceeds capacity * loadFactor
        if (size > capacity * loadFactor) {
            resize();
        }
    }

    // Resize the bucket array when the load factor is exceeded
    @SuppressWarnings("unchecked")
    private void resize() {
        int newCapacity = capacity * 2;
        Object[] newBuckets = new Object[newCapacity];

        for (int i = 0; i < capacity; i++) {
            Node head = getNode(i);
            while (head != null) {
                Node next = head.next;

                int newIndex = Math.abs(head.key.hashCode()) % newCapacity;
                head.next = (Node) newBuckets[newIndex];
                newBuckets[newIndex] = head;

                head = next;
            }
        }

        buckets = newBuckets;
        capacity = newCapacity;
    }

    // Get value by key
    public V get(K key) {
        int index = getBucketIndex(key);
        Node head = getNode(index);

        while (head != null) {
            if (head.key.equals(key)) {
                return head.value;
            }
            head = head.next;
        }
        return null; // not found
    }
}
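
One operation both sketches leave out is remove(). A minimal version that follows the same bucket-traversal pattern, written as a method to drop into the Object[]-backed MyHashMap above:

    // Remove a key and return its value (or null if the key is absent)
    public V remove(K key) {
        int index = getBucketIndex(key);
        Node head = getNode(index);
        Node prev = null;

        while (head != null) {
            if (head.key.equals(key)) {
                if (prev == null) {
                    setNode(index, head.next); // removing the first node in the bucket
                } else {
                    prev.next = head.next;     // unlink the node from the chain
                }
                size--;
                return head.value;
            }
            prev = head;
            head = head.next;
        }
        return null; // key not found
    }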


package hashmap;

public class Main {
    public static void main(String[] args) {
        MyHashMap<String, Integer> map = new MyHashMap<>();

        // Add key-value pairs
        map.put("A", 1);
        map.put("B", 2);
        map.put("C", 3);
        map.put("D", 4); // 4th entry exceeds threshold 3 → resize (capacity 4 → 8)

        // Retrieve values
        System.out.println("A => " + map.get("A")); // 1
        System.out.println("B => " + map.get("B")); // 2
        System.out.println("C => " + map.get("C")); // 3
        System.out.println("D => " + map.get("D")); // 4
        System.out.println("E => " + map.get("E")); // null (not present)

        // Update a value
        map.put("B", 20);
        System.out.println("B => " + map.get("B")); // 20

        // Test resizing by adding more keys
        map.put("E", 5);
        map.put("F", 6);
        map.put("G", 7); // 7th entry exceeds threshold 6 → resize (capacity 8 → 16)
        map.put("H", 8);

        System.out.println("After adding more keys:");
        System.out.println("E => " + map.get("E")); // 5
        System.out.println("H => " + map.get("H")); // 8
    }
}


