Hash Table Operations: Put, Get, and Delete
The true power of hash tables lies in their ability to perform core operations – insertion, retrieval, and deletion – with remarkable efficiency. While we've discussed the theory of hashing and collision resolution, let's now examine how these principles translate into the mechanics of these essential operations. The goal for all these operations is an average time complexity of O(1).
1. Insertion (put or set)
The put (or set) operation is used to add a new key-value pair to the hash table or update the value associated with an existing key.
- Purpose: To store data that can be quickly retrieved later using its unique key.
- Steps (a minimal code sketch follows this list):
  - Hash the Key: The provided `key` is passed to the hash function, which computes an integer hash code.
  - Calculate Index: The hash code is usually put through a modulo operation (e.g., `hashCode % arraySize`) to get an index that falls within the bounds of the hash table's underlying array (its "buckets").
  - Check for Existing Key (Update): At the calculated index, the hash table checks whether the `key` already exists. If it does, its associated `value` is updated.
  - Handle Collision (Insert New): If the `key` is not found at the initial index, or if the index is already occupied by a different key (a collision):
    - Chaining: The new key-value pair is added to the linked list (or other structure) at that bucket's index.
    - Open Addressing: The hash table follows its probing sequence (linear, quadratic, or double hashing) to find the next available empty slot, where the key-value pair is then inserted.
  - Update Size & Check Load Factor: The element count is incremented. If the load factor (number of elements / number of buckets) exceeds a predefined threshold (e.g., 0.7), the hash table typically triggers a rehashing operation to resize itself.
- Time Complexity:
  - Average Case: O(1). Hashing and direct access to the bucket take constant time, and if collisions are minimal, the work within the bucket (adding to a short list or finding an empty slot) is also constant.
  - Worst Case: O(N). This can occur if:
    - All N keys hash to the same bucket (in chaining, this requires traversing an N-length linked list).
    - The table becomes very full, leading to extremely long probe sequences (in open addressing).
    - A rehashing operation is triggered, which involves re-inserting all N elements into a new, larger table.
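To make these steps concrete, here is a minimal sketch of a chaining-based `put` in TypeScript. The `Entry` and `Buckets` types, the `bucketIndex` helper, and the 0.75 load-factor threshold are illustrative assumptions for this sketch, not the API of any particular library.

```typescript
// Illustrative chaining layout: each bucket is an array of entries.
type Entry = { key: string; value: number };
type Buckets = Entry[][];

// djb2-style string hash, reduced to a bucket index with a modulo.
function bucketIndex(key: string, size: number): number {
  let hash = 5381;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 33 + key.charCodeAt(i)) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash % size;
}

// Insert a new key-value pair or update an existing one.
function put(buckets: Buckets, key: string, value: number): void {
  const index = bucketIndex(key, buckets.length);
  const bucket = buckets[index];

  // Check for Existing Key: if the key is already in this bucket, update it in place.
  for (const entry of bucket) {
    if (entry.key === key) {
      entry.value = value;
      return;
    }
  }

  // Handle Collision (chaining): append the new pair to this bucket's chain.
  bucket.push({ key, value });

  // Update Size & Check Load Factor: a real table tracks the count incrementally;
  // this sketch recomputes it only for brevity.
  const count = buckets.reduce((n, b) => n + b.length, 0);
  if (count / buckets.length > 0.75) {
    // trigger rehashing here (see the rehashing sketch later in this section)
  }
}

// Usage: an 8-bucket table; the second call updates rather than inserts.
const table: Buckets = Array.from({ length: 8 }, () => []);
put(table, "apples", 3);
put(table, "apples", 5);
```

Recomputing the count is only for brevity; real implementations keep a size field up to date so the load-factor check itself stays O(1).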
2. Retrieval (get)
The get operation is used to find and return the value associated with a given key.
- Purpose: To quickly look up data based on its key.
- Steps (a sketch follows this list):
  - Hash the Key: The `key` is passed to the hash function to compute its hash code.
  - Calculate Index: The hash code is used to determine the exact index in the underlying array.
  - Search at Index: The hash table goes directly to this index and searches for the `key`:
    - Chaining: It traverses the linked list at that bucket, comparing the input `key` with the `key` of each node until a match is found.
    - Open Addressing: It follows the same probing sequence used during insertion, searching for the `key`. The search stops if the `key` is found, or if an empty slot is encountered (meaning the key is not in the table).
  - Return Value: If the `key` is found, its associated `value` is returned. Otherwise, `null`, `undefined`, or another appropriate indicator is returned.
- Time Complexity:
  - Average Case: O(1). As with insertion, fast hash calculation and minimal collision resolution make it constant time.
  - Worst Case: O(N). If all keys sit in one long collision chain (chaining), or the lookup must traverse almost the entire table (open addressing), retrieval becomes linear.
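A chaining-based `get` follows the same first two steps and then walks the chain. This sketch reuses the `Entry`, `Buckets`, and `bucketIndex` definitions (all assumptions of these sketches) from the insertion example above.

```typescript
// Look up a key; returns its value, or undefined if the key is absent.
function get(buckets: Buckets, key: string): number | undefined {
  // Hash the Key & Calculate Index: jump straight to the bucket.
  const index = bucketIndex(key, buckets.length);

  // Search at Index (chaining): compare against each entry in the chain.
  for (const entry of buckets[index]) {
    if (entry.key === key) {
      return entry.value; // Return Value: key found
    }
  }
  return undefined; // key not present in the table
}
```

The early return on a match is what keeps the average case O(1): with a good hash function and a healthy load factor, each chain holds only a handful of entries.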
3. Deletion (delete)
The delete operation removes a key-value pair from the hash table.
- Purpose: To remove data and free up its associated storage.
- Steps (a tombstone-based sketch follows this list):
  - Hash the Key & Calculate Index: The process begins identically to `get` and `put`, hashing the `key` to find its potential index.
  - Locate Key: The hash table navigates to that index and searches for the `key` using the appropriate collision resolution method (traversing a linked list for chaining, or following a probe sequence for open addressing).
  - Remove Pair:
    - Chaining: Once the matching `key` is found within the linked list at that bucket, the node containing the key-value pair is simply removed from the list (standard linked list deletion).
    - Open Addressing: This is more complex. Simply emptying the slot could break the probe sequence for other elements that hashed to the same initial index but were placed further down. To avoid this, the slot is typically marked with a special "deleted" flag (often called a tombstone). A deleted slot is ignored during lookups but can be reused for future insertions.
  - Update Size: The number of elements in the hash table is decremented.
- Time Complexity:
  - Average Case: O(1).
  - Worst Case: O(N), for the same reasons as `put` and `get`.
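The tombstone idea is easiest to see in code. Below is a small, self-contained sketch of deletion in a linear-probing (open addressing) table; the `Slot` union, the `TOMBSTONE` marker, and the `probeStart` helper are names invented for this illustration.

```typescript
// Open-addressing layout: each slot is empty, a live entry, or a tombstone.
const TOMBSTONE = Symbol("deleted");
type Slot = { key: string; value: number } | typeof TOMBSTONE | null;

// Simple string hash used only for this demonstration.
function probeStart(key: string, size: number): number {
  let hash = 5381;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 33 + key.charCodeAt(i)) >>> 0;
  }
  return hash % size;
}

// Delete a key using linear probing; returns true if something was removed.
function remove(slots: Slot[], key: string): boolean {
  const start = probeStart(key, slots.length);

  for (let step = 0; step < slots.length; step++) {
    const index = (start + step) % slots.length; // linear probe sequence
    const slot = slots[index];

    if (slot === null) {
      return false; // an empty slot means the key was never inserted
    }
    if (slot !== TOMBSTONE && slot.key === key) {
      // Mark the slot as deleted instead of emptying it, so probe sequences
      // that pass through this slot are not broken for other keys.
      slots[index] = TOMBSTONE;
      return true;
    }
    // Otherwise: a tombstone or a different key; keep probing.
  }
  return false; // scanned the whole table without finding the key
}
```

By contrast, a chaining delete needs no tombstones: the matching node is simply unlinked from the bucket's list, as with any linked list.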
4. The Amortized O(1) of Rehashing
While the primary operations aim for O(1), it's crucial to understand how rehashing (resizing the table) fits in. Put operations, in particular, can trigger rehashing if the load factor becomes too high.
- Impact: A rehashing operation involves creating a new, larger array and re-inserting all existing N elements into this new table, which is an O(N) process (a short sketch of this resize follows below).
- Amortized Analysis: Despite this occasional O(N) cost, hash tables are still said to have O(1) average time complexity. Rehashing happens infrequently (e.g., only when the table doubles in size), so its cost is "amortized", or spread out, over many O(1) operations. The total cost of M operations, including some rehashes, is proportional to M, making the average cost per operation O(1).
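As a rough illustration of that O(N) step, here is a rehash for the chaining layout used earlier, again reusing the assumed `Entry`, `Buckets`, and `bucketIndex` definitions from the insertion sketch.

```typescript
// Grow the table and re-insert every existing entry into the new array.
function rehash(old: Buckets): Buckets {
  // Double the bucket count so that resizes stay infrequent.
  const resized: Buckets = Array.from({ length: old.length * 2 }, () => []);

  // Every entry must be re-hashed because its index depends on the table
  // size; this loop is the O(N) cost that gets amortized.
  for (const bucket of old) {
    for (const entry of bucket) {
      resized[bucketIndex(entry.key, resized.length)].push(entry);
    }
  }
  return resized;
}
```

In rough terms, a resize that re-inserts N entries can only fire after a number of new insertions proportional to N since the previous resize, so the O(N) cost averages out to O(1) extra work per put.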
Key Takeaway: Hash table operations (`put`, `get`, `delete`) achieve remarkable O(1) average-case time complexity by leveraging hash functions to quickly locate data. While collisions and periodic rehashing can lead to O(N) worst-case scenarios, the overall amortized performance remains constant time, making hash tables extremely efficient for dynamic key-value storage.

