Wei Jie's Lectures Notes

Algorithm Analysis and Design

Lecture 02 Hash Tables

Dec 16, 2025

Hash Table ADT

A hash table is a table of elements with keys.
A hash function locates the position of a key in the table.
With a good hash function and few collisions, searching for an element takes Θ(1) time on average.

Selected Hash Table ADT Operations

insert: Insert an element into the table.
retrieve: Retrieve an element from the table.
An operation to empty out the hash table.

Hash Functions

Input: A key value.
Output: An index of an array (hash table) where the object containing the key is located.
Example:
h(k) = k % table_size

Example Using a Hash Function

Hash function: h(k) = k % 100
Search for key 214:
- k = 214
- h(214) = 214 % 100 = 14
- The object is stored at index 14 of the array.
- The search takes Θ(1) time: one hash computation and one array access.

Inserting an Element

Use the same hash function: h(k) = k % 100.
For key 214, the object is stored at index 14.
Insertion is done in Θ(1) time.
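The lookup and insertion examples above can be sketched in Python as follows. This is a minimal illustration that ignores collisions; the names `TABLE_SIZE`, `h`, and `table` are chosen here for the example and do not come from the lecture.

```python
TABLE_SIZE = 100  # table size used in the examples above

def h(k: int) -> int:
    """Map an integer key to an index in the range 0..TABLE_SIZE-1."""
    return k % TABLE_SIZE

# The hash table is a plain array; every slot starts empty.
table = [None] * TABLE_SIZE

# Insert: one hash computation and one array write.
table[h(214)] = "object with key 214"

# Retrieve: recompute the same index and read the slot.
print(h(214))         # 14
print(table[h(214)])  # object with key 214
```

Both operations touch exactly one slot regardless of how many elements the table holds, which is where the Θ(1) figure comes from.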

Big Picture Comparison

Linear search: O(n) (e.g., about 500,000 comparisons on average for 1M keys).
Binary search: O(lg n) (e.g., about 20 comparisons for 1M keys, or about 19 for 500,000 keys).
Hash table: about 1 comparison, vs. about 20 (binary) vs. about 500,000 (linear) for 1M keys.
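The comparison counts above can be checked with a few lines of arithmetic. The sketch below assumes an average-case successful linear search (half the keys examined) and a worst-case binary search; the function names are illustrative.

```python
import math

def linear_avg_comparisons(n: int) -> int:
    """Average comparisons for a successful linear search over n keys."""
    return n // 2

def binary_worst_comparisons(n: int) -> int:
    """Worst-case comparisons for binary search over n sorted keys."""
    return math.floor(math.log2(n)) + 1

print(linear_avg_comparisons(1_000_000))    # 500000
print(binary_worst_comparisons(500_000))    # 19
print(binary_worst_comparisons(1_000_000))  # 20
```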

Collisions

Occur when two keys hash to the same index.
Example:
- h(k) = k % 100
- Keys 393 and 193 both hash to 93.
Resolution methods: Chaining, linear probing, etc.
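A quick way to see the collision in this example is to group keys by their hash index. This is only a sketch; the extra key 214 and the `buckets` dictionary are added here for illustration.

```python
from collections import defaultdict

def h(k: int) -> int:
    return k % 100

# Group each key under the index it hashes to.
buckets = defaultdict(list)
for key in [393, 193, 214]:
    buckets[h(key)].append(key)

# Any index holding more than one key is a collision.
collisions = {i: keys for i, keys in buckets.items() if len(keys) > 1}
print(collisions)  # {93: [393, 193]}
```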

Collision Resolution: Chaining

Use an array of linked lists.
Hash function provides the index of the linked list.
Insert at the front of the linked list.
Java’s HashSet and HashMap use chaining.

Example Using Chaining

Hash function: h(k) = k % 7
Insert keys: 31, 9, 36, 42, 46, 20, 2, 24.
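Collisions occur for key 2 (index 2 is already occupied by 9) and for key 24 (index 3 is already occupied by 31).
Each colliding key is inserted at the front of the linked list at its index, so the list at index 2 becomes 2 → 9 and the list at index 3 becomes 24 → 31.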
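The chaining example above can be reproduced with a small table of singly linked lists. This is a minimal sketch, not the lecture's implementation; the class and method names (`Node`, `ChainedHashTable`, `chain`) are assumptions made for the example.

```python
class Node:
    """One element of a singly linked chain."""
    def __init__(self, key, next_node=None):
        self.key = key
        self.next = next_node

class ChainedHashTable:
    def __init__(self, size):
        self.size = size
        self.heads = [None] * size  # one (possibly empty) chain per index

    def insert(self, key):
        i = key % self.size
        # Front insertion: the new node points at the old head, Θ(1).
        self.heads[i] = Node(key, self.heads[i])

    def contains(self, key):
        node = self.heads[key % self.size]
        while node is not None:  # walk only the one chain for this index
            if node.key == key:
                return True
            node = node.next
        return False

    def chain(self, i):
        """Return the keys stored at index i, front to back."""
        keys, node = [], self.heads[i]
        while node is not None:
            keys.append(node.key)
            node = node.next
        return keys

table = ChainedHashTable(7)
for key in [31, 9, 36, 42, 46, 20, 2, 24]:
    table.insert(key)

print(table.chain(2))  # [2, 9]
print(table.chain(3))  # [24, 31]
print(table.chain(5))  # []
```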

Clustering in Chaining

Some linked lists are long; others are empty.
Worst-case search time: O(n), when all n keys land in one chain (search time is bounded by the maximum chain length).

Open Addressing

Store elements directly in the array (no linked lists).
Saves memory.
Examples: Linear probing, quadratic probing.

Collision Resolution: Linear Probing

On collision, place the element in the next free slot.
Example: After a collision at index 5, try index 6, then 7, and so on, wrapping around to index 0 at the end of the array.
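Linear probing as described above might look like the following sketch. The table size of 10 and the keys 15, 25, 35, and 6 are made up for illustration, and deletion is deliberately left out.

```python
class LinearProbingTable:
    def __init__(self, size):
        self.size = size
        self.slots = [None] * size  # elements live directly in the array

    def insert(self, key):
        """Store key in the first free slot at or after its home index."""
        home = key % self.size
        for step in range(self.size):
            i = (home + step) % self.size  # wrap around at the end
            if self.slots[i] is None:
                self.slots[i] = key
                return i
        raise RuntimeError("hash table is full")

    def find(self, key):
        """Return the index holding key, or None if it is absent."""
        home = key % self.size
        for step in range(self.size):
            i = (home + step) % self.size
            if self.slots[i] is None:
                return None  # reached an empty slot: key was never stored
            if self.slots[i] == key:
                return i
        return None

table = LinearProbingTable(10)
placed = [table.insert(k) for k in [15, 25, 35, 6]]
print(placed)  # [5, 6, 7, 8]
```

Note that key 6 hashes to index 6 but ends up at index 8, because keys that collided at index 5 already filled slots 6 and 7. That spill-over is the clustering effect.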

Problem with Linear Probing

Inserting 56 may require probing multiple slots (e.g., 16, 17, 18, 19).

Clustering in Linear Probing

Colliding keys fill runs of consecutive slots (clusters), while other regions of the array stay empty.
Worst-case search time: O(n), where n is the array length.

Linear Probing Improvement: Quadratic Probing

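On the j-th attempt, probe the cell j^2 positions past the original collision point, wrapping around the array.
Limitation: The probe sequence visits only some of the cells, so it may fail to find an empty cell once the array is more than half full.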
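The limitation can be seen by listing which indexes a quadratic probe sequence is able to reach. The helper below is a sketch written for this illustration; the table sizes 7 and 10 are arbitrary examples.

```python
def reachable_slots(home: int, size: int) -> list:
    """Indexes visited by the sequence home + j^2 (mod size), j = 0, 1, 2, ...

    j^2 mod size repeats with period size, so j in range(size) covers
    every index the sequence can ever reach.
    """
    return sorted({(home + j * j) % size for j in range(size)})

print(reachable_slots(0, 7))   # [0, 1, 2, 4]
print(reachable_slots(7, 10))  # [1, 2, 3, 6, 7, 8]
```

With a table of size 7 and home index 0, only 4 of the 7 slots are ever probed. If those 4 are occupied, the insertion fails even though 3 slots are still empty.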

Example of Quadratic Probing

Hash function: f(k) = k % 10
Insert keys: 27, 17, 37, 47, 48, 57.
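- 27: No collision, stored at 7.
- 17: Collision at 7, probe 7 + 1^2 = 8.
- 37: Collisions at 7 and 8, probe 7 + 2^2 = 11 → 1.
- 47: Collisions at 7, 8, and 1, probe 7 + 3^2 = 16 → 6.
- 48: Collision at 8, probe 8 + 1^2 = 9.
- 57: Collisions at 7, 8, 1, and 6, probe 7 + 4^2 = 23 → 3.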
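The quadratic probing example can be traced with a short function. This is a sketch under the same assumptions as the example (table size 10, offsets measured from the key's home index); `quadratic_insert` is a name chosen here.

```python
def quadratic_insert(slots: list, key: int) -> int:
    """Insert key using probes home, home + 1^2, home + 2^2, ... (mod size)."""
    size = len(slots)
    home = key % size
    for j in range(size):
        i = (home + j * j) % size
        if slots[i] is None:
            slots[i] = key
            return i
    raise RuntimeError("no free slot on the probe sequence")

slots = [None] * 10
placed = {k: quadratic_insert(slots, k) for k in [27, 17, 37, 47, 48, 57]}
print(placed)  # {27: 7, 17: 8, 37: 1, 47: 6, 48: 9, 57: 3}
```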

Chaining vs. Linear Probing

Chaining: Extra memory for linked lists.
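Linear Probing: Fixed memory and better cache locality, since probes touch adjacent array slots.
Clustering: Linear probing's search time degrades when there are many collisions.
Load factor (stored elements / table size): Linear probing is the better choice when the load factor is below 0.85.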
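To get a feel for how the load factor affects linear probing, the sketch below fills a table to a chosen load factor and reports the mean number of probes per insertion. It assumes load factor means stored elements divided by table size, uses seeded random keys, and picks a table size of 1009 arbitrarily. It illustrates the trend only and does not reproduce the 0.85 threshold quoted above.

```python
import random

def average_insert_probes(size: int, load_factor: float, seed: int = 0) -> float:
    """Fill a linear-probing table to load_factor (< 1); return mean probes per insert."""
    rng = random.Random(seed)
    slots = [None] * size
    n = int(size * load_factor)
    total_probes = 0
    for key in rng.sample(range(10 * size), n):  # n distinct random keys
        i = key % size
        probes = 1
        while slots[i] is not None:  # step to the next slot on collision
            i = (i + 1) % size
            probes += 1
        slots[i] = key
        total_probes += probes
    return total_probes / n

for alpha in (0.5, 0.85, 0.95):
    print(alpha, round(average_insert_probes(1009, alpha), 2))
```

The printed averages grow as the load factor approaches 1, which is the clustering cost that the comparison above refers to.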

Uniform Hashing

Elements are spread evenly among indexes.
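Allows Θ(1) average search time for both chaining and open addressing.
An unsuccessful search (miss) in open addressing can still take O(n) in the worst case, because probing continues until an empty slot is reached.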

Ideal Hash Function for Uniform Hashing

Choose a prime table size that is not close to a power of 2.
- Example: 97 (not 31, which is 2^5 - 1).
Hash function: h(k) = k % 97.
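One way to see why a prime table size helps is to hash keys that share a common factor. In the sketch below, the keys (multiples of 50) and the non-prime size 100 are chosen for illustration; they are not from the lecture.

```python
def used_indexes(keys, size: int) -> int:
    """Count how many distinct table indexes the keys occupy."""
    return len({k % size for k in keys})

keys = range(0, 5000, 50)  # 100 keys: 0, 50, 100, ..., 4950

print(used_indexes(keys, 100))  # 2
print(used_indexes(keys, 97))   # 97
```

With size 100 every key lands on index 0 or 50, so two chains hold all 100 keys. With size 97 the same keys reach every index, because 50 and 97 share no common factor.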

Ideal Hash Tables

No collisions: Use h(k) = k (unique keys).
- Example: 300 employees with 4-digit IDs → table size 10,000 (97% empty).
No collisions or empty slots:
- Example: 300 employees with IDs 0-299 → table size 300, h(k) = k.
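The two cases above can be sketched with plain arrays. The employee records are placeholder strings invented for the example.

```python
NUM_EMPLOYEES = 300

# Case 1: 4-digit IDs with h(k) = k need 10,000 slots for 300 employees.
four_digit_slots = 10_000
empty_fraction = (four_digit_slots - NUM_EMPLOYEES) / four_digit_slots
print(empty_fraction)  # 0.97

# Case 2: IDs 0..299 with h(k) = k fill a 300-slot table exactly.
table = [None] * NUM_EMPLOYEES
for emp_id in range(NUM_EMPLOYEES):
    table[emp_id] = f"employee {emp_id}"  # h(k) = k, so the ID is the index

print(table[214])                 # employee 214
print(table.count(None))          # 0
```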

Speed vs. Memory Conservation

Speed: Large table (no collisions) → fastest.
Memory: Minimize empty slots → most efficient.

Hash Table Design

Decide priority: speed or memory conservation.
Choose table size:
- Allows a good hash function.
- Balances speed and memory.

Time Complexities

insert: Θ(1) (insert at head of linked list).
retrieve: Θ(1) for uniform hashing (bounded chain length).
